WORLD INTELLECTUAL PROPERTY ORGANIZATION
International Bureau

PCT

INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT)

(11) International Publication Number: WO 95/27080

(43) International Publication Date: 12 October 1995 (12.10.95)

(51) International Patent Classification: C12Q 1/68, C12P 19/34, C12Q 1/34

A2

(21) International Application Number: PCT/US95/03678
(22) International Filing Date: 24 March 1995 (24.03.95)



(30) Priority Data:
    08/222,300    4 April 1994 (04.04.94)    US
    08/280,441    25 July 1994 (25.07.94)    US



(71) Applicant: LYNX THERAPEUTICS, INC. [US/US]; 3832 Bay Center Place,
Hayward, CA 94545 (US).

(72) Inventor: BRENNER, Sydney; University of Cambridge, School of Clinical
Medicine, Addenbrooke's Hospital, Level 5, Hills Road, Cambridge CB2 QQ (GB).

(74) Agent: MACEVICZ, Stephen, C.; Lynx Therapeutics, Inc., 3832 Bay Center Place,
Hayward, CA 94545 (US).

(81) Designated States: AU, CA, JP, SO, European patent (AT, BE, CH, DE, DK, ES,
FR, GB, GR, IE, IT, LU, MC, NL, PT, SE).



Published 

Without international search report and to be republished 
upon receipt of that report. 




(54) Title: DNA SEQUENCING BY STEPWISE LIGATION AND CLEAVAGE
(57) Abstract

The invention provides a method of nucleic acid sequence analysis based on repeated cycles of ligation to and cleavage of probes at
the terminus of a target polynucleotide. At each such cycle one or more terminal nucleotides are identified and one or more nucleotides
are removed from the end of the target polynucleotide, such that further cycles of ligation and cleavage can take place. At each cycle
the target sequence is shortened by one or more nucleotides until the nucleotide sequence of the target polynucleotide is determined. The
method obviates electrophoretic separation of similarly sized DNA fragments and eliminates the difficulties associated with the detection
and analysis of spatially overlapping bands of DNA fragments in a gel, or like medium. The invention further obviates the need to generate
DNA fragments from long single stranded templates with a DNA polymerase.

DNA SEQUENCING BY STEPWISE
LIGATION AND CLEAVAGE



Field of the Invention

The invention relates generally to methods for determining the nucleotide
sequence of a polynucleotide, and more particularly, to a method of stepwise removal
and identification of terminal nucleotides of a polynucleotide.

BACKGROUND

Analysis of polynucleotides with currently available techniques provides a
spectrum of information ranging from the confirmation that a test polynucleotide is the
same as or different from a standard sequence or an isolated fragment to the express
identification and ordering of each nucleoside of the test polynucleotide. Not only are
such techniques crucial for understanding the function and control of genes and for
applying many of the basic techniques of molecular biology, but they have also become
increasingly important as tools in genomic analysis and a great many non-research
applications, such as genetic identification, forensic analysis, genetic counseling, medical
diagnostics, and the like. In these latter applications both techniques providing partial
sequence information, such as fingerprinting and sequence comparisons, and techniques
providing full sequence determination have been employed, e.g. Gibbs et al, Proc. Natl.
Acad. Sci., 86: 1919-1923 (1989); Gyllensten et al, Proc. Natl. Acad. Sci., 85: 7652-
7656 (1988); Carrano et al, Genomics, 4: 129-136 (1989); Caetano-Anolles et al, Mol.
Gen. Genet., 235: 157-165 (1992); Brenner and Livak, Proc. Natl. Acad. Sci., 86:
8902-8906 (1989); Green et al, PCR Methods and Applications, 1: 77-90 (1991); and
Versalovic et al, Nucleic Acids Research, 19: 6823-6831 (1991).

Native DNA consists of two linear polymers, or strands of nucleotides. Each
strand is a chain of nucleosides linked by phosphodiester bonds. The two strands are
held together in an antiparallel orientation by hydrogen bonds between complementary
bases of the nucleotides of the two strands: deoxyadenosine (A) pairs with thymidine
(T) and deoxyguanosine (G) pairs with deoxycytidine (C).

Presently there are two basic approaches to DNA sequence determination: the
dideoxy chain termination method, e.g. Sanger et al, Proc. Natl. Acad. Sci., 74: 5463-
5467 (1977); and the chemical degradation method, e.g. Maxam et al, Proc. Natl. Acad.
Sci., 74: 560-564 (1977). The chain termination method has been improved in several
ways, and serves as the basis for all currently available automated DNA sequencing
machines, e.g. Sanger et al, J. Mol. Biol., 143: 161-178 (1980); Schreier et al, J. Mol.
Biol., 129: 169-172 (1979); Smith et al, Nucleic Acids Research, 13: 2399-2412 (1985);
Smith et al, Nature, 321: 674-679 (1987); Prober et al, Science, 238: 336-341 (1987);
Section II, Meth. Enzymol., 155: 51-334 (1987); Church et al, Science, 240: 185-188
(1988); Hunkapiller et al, Science, 254: 59-67 (1991); Bevan et al, PCR Methods and
Applications, 1: 222-228 (1992).

Both the chain termination and chemical degradation methods require the
generation of one or more sets of labeled DNA fragments, each having a common origin
and each terminating with a known base. The set or sets of fragments must then be
separated by size to obtain sequence information. In both methods, the DNA fragments
are separated by high resolution gel electrophoresis, which must have the capacity of
distinguishing very large fragments differing in size by no more than a single nucleotide.
Unfortunately, this step severely limits the size of the DNA chain that can be sequenced
at one time. Sequencing using these techniques can reliably accommodate a DNA
chain of up to about 400-450 nucleotides, Bankier et al, Meth. Enzymol., 155: 51-93
(1987); and Hawkins et al, Electrophoresis, 13: 552-559 (1992).

Several significant technical problems have seriously impeded the application of
such techniques to the sequencing of long target polynucleotides, e.g. in excess of 500-
600 nucleotides, or to the sequencing of high volumes of many target polynucleotides.
Such problems include i) the gel electrophoretic separation step, which is labor intensive,
is difficult to automate, and introduces an extra degree of variability in the analysis of
data, e.g. band broadening due to temperature effects, compressions due to secondary
structure in the DNA sequencing fragments, inhomogeneities in the separation gel, and
the like; ii) nucleic acid polymerases whose properties, such as processivity, fidelity, rate
of polymerization, rate of incorporation of chain terminators, and the like, are often
sequence dependent; iii) detection and analysis of DNA sequencing fragments which are
typically present in fmol quantities in spatially overlapping bands in a gel; iv) lower
signals because the labeling moiety is distributed over the many hundred spatially
separated bands rather than being concentrated in a single homogeneous phase; and v)
in the case of single-lane fluorescence detection, the availability of dyes with suitable
emission and absorption properties, quantum yield, and spectral resolvability, e.g.
Trainor, Anal. Biochem., 62: 418-426 (1990); Connell et al, Biotechniques, 5: 342-348
(1987); Karger et al, Nucleic Acids Research, 19: 4955-4962 (1991); Fung et al, U.S.
patent 4,855,225; and Nishikawa et al, Electrophoresis, 12: 623-631 (1991).

Another problem exists with current technology in the area of diagnostic
sequencing. An ever widening array of disorders, susceptibilities to disorders,
prognoses of disease conditions, and the like, have been correlated with the presence of
particular DNA sequences, or the degree of variation (or mutation) in DNA sequences,
at one or more genetic loci. Examples of such phenomena include human leukocyte
antigen (HLA) typing, cystic fibrosis, tumor progression and heterogeneity, p53 proto-
oncogene mutations, ras proto-oncogene mutations, and the like, e.g. Gyllensten et al,
PCR Methods and Applications, 1: 91-98 (1991); Santamaria et al, International
application PCT/US92/01675; Tsui et al, International application PCT/CA90/00267;
and the like. A difficulty in determining DNA sequences associated with such
conditions to obtain diagnostic or prognostic information is the frequent presence of
multiple subpopulations of DNA, e.g. allelic variants, multiple mutant forms, and the
like. Distinguishing the presence and identity of multiple sequences with current
sequencing technology is virtually impossible without additional work to isolate and
perhaps clone the separate species of DNA.

A major advance in sequencing technology could be made if an alternative
approach were available for sequencing DNA that did not require high resolution
separations, provided signals more amenable to analysis, and provided a means for
readily analyzing DNA from heterozygous genetic loci.

Summary of the Invention

The invention provides a method of nucleic acid sequence analysis based on
ligation and cleavage of probes at the terminus of a target polynucleotide. Preferably,
repeated cycles of such ligation and cleavage are implemented in the method, and in
each such cycle a nucleotide is identified at the end of the target polynucleotide and the
target polynucleotide is shortened, such that further cycles of ligation, cleavage, and
identification can take place. That is, preferably, in each cycle the target sequence is
shortened by a single nucleotide and the cycles are repeated until the nucleotide
sequence of the target polynucleotide is determined.

An important feature of the invention is the probe employed in the ligation and
cleavage events. A probe of the invention is a double stranded polynucleotide which (i)
contains a recognition site for a nuclease, and (ii) preferably has a protruding strand
capable of forming a duplex with a complementary protruding strand of the target
polynucleotide. At each cycle in the latter embodiment, only those probes whose
protruding strands form perfectly matched duplexes with the protruding strand of the
target polynucleotide are ligated to the end of the target polynucleotide to form a
ligated complex. After removal of the unligated probe, a nuclease recognizing the probe
cuts the ligated complex at a site one or more nucleotides from the ligation site along
the target polynucleotide, leaving an end, usually a protruding strand, capable of
participating in the next cycle of ligation and cleavage. An important feature of the
nuclease is that its recognition site be separate from its cleavage site. As is described
more fully below, in the course of such cycles of ligation and cleavage, the terminal
nucleotides of the target polynucleotide are identified.

In one aspect of the invention, more than one nucleotide at the terminus of a
target polynucleotide can be identified and/or cleaved during each cycle of the method.

Generally, the method of the invention comprises the following steps: (a)
ligating a probe to an end of the polynucleotide, the probe having a nuclease recognition
site; (b) identifying one or more nucleotides at the end of the polynucleotide; (c)
cleaving the polynucleotide with a nuclease recognizing the nuclease recognition site of
the probe such that the polynucleotide is shortened by one or more nucleotides; and (d)
repeating steps (a) through (c) until the nucleotide sequence of the polynucleotide is
determined. As is described more fully below, the order of steps (a) through (c) may
vary with different embodiments of the invention. For example, identifying the one or
more nucleotides can be carried out either before or after cleavage of the ligated
complex from the target polynucleotide. Likewise, ligating a probe to the end of the
polynucleotide may follow the step of identifying in some preferred embodiments of the
invention. Preferably, the method further includes a step of removing the unligated
probe after the step of ligating.
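
The cycle of steps (a) through (d) can be sketched as a small simulation. This is an idealized illustration only, not the laboratory procedure: it assumes error-free ligation, a four-nucleotide protruding strand, a probe label that reports the terminal nucleotide, and cleavage that removes exactly one nucleotide per cycle. All function and variable names are hypothetical.

```python
from itertools import product

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the strand that pairs with seq, written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def sequence_by_ligation_and_cleavage(target, overhang=4):
    """Call bases from the end of target, one nucleotide per cycle.

    Assumption: target is the strand carrying the protruding end,
    written so that its terminal nucleotide is target[0].
    """
    probes = ["".join(p) for p in product("ACGT", repeat=overhang)]
    called = []
    remaining = target
    while len(remaining) >= overhang:
        protruding = remaining[:overhang]
        # (a) ligation: only the perfectly matched probe is joined
        ligated = [p for p in probes if p == reverse_complement(protruding)]
        assert len(ligated) == 1
        # (b) identification: the probe base opposite the terminal
        # nucleotide is the last base of the probe strand (assumed label)
        called.append(COMPLEMENT[ligated[0][-1]])
        # (c) cleavage: the target is shortened by one nucleotide
        remaining = remaining[1:]
    # (d) the loop repeats until too little target remains to ligate
    return "".join(called)


print(sequence_by_ligation_and_cleavage("ATGCCTGA"))
```

In this toy model the last three nucleotides are never called, because a protruding strand of the assumed length can no longer be formed.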

Preferably, whenever natural protein endonucleases are employed as the 
nuclease, the method further includes a step of methylating the target polynucleotide at 
the start of a sequencing operation to prevent spurious cleavages at internal recognition 
sites fortuitously located in the target polynucleotide. 
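
To illustrate the problem that methylation addresses, one can scan a target for internal copies of the nuclease recognition site on either strand. The sketch below is a minimal example under stated assumptions: the site GGATG used in the example is the commonly cited Fok I recognition sequence and is supplied here from general knowledge rather than from the passage above, and the function name is hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def find_internal_sites(target, site):
    """List (position, strand) for each occurrence of site in target.

    "+" means the site reads 5'->3' on the given strand; "-" means it
    lies on the complementary strand.
    """
    rc_site = "".join(COMPLEMENT[b] for b in reversed(site))
    hits = []
    for i in range(len(target) - len(site) + 1):
        window = target[i:i + len(site)]
        if window == site:
            hits.append((i, "+"))
        elif window == rc_site:
            hits.append((i, "-"))
    return hits


# Assumed example: GGATG as the recognition site
print(find_internal_sites("AAGGATGTTCATCCAA", "GGATG"))
```

Each hit marks a place where an unprotected target could be cut away from the intended cleavage site.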
The present invention overcomes many of the deficiencies inherent to current
methods of DNA sequencing: there is no requirement for the electrophoretic separation
of closely-sized DNA fragments; no difficult-to-automate gel-based separations are
required; no polymerases are required for generating nested sets of DNA sequencing
fragments; detection and analysis are greatly simplified because signal-to-noise ratios are
much more favorable on a nucleotide-by-nucleotide basis, permitting smaller sample
sizes to be employed; and for fluorescent-based detection schemes, analysis is further
simplified because fluorophores labeling different nucleotides may be separately
detected in homogeneous solutions rather than in spatially overlapping bands.

The present invention is readily automated, both for small-scale serial operation
and for large-scale parallel operation, wherein many target polynucleotides or many
segments of a single target polynucleotide are sequenced simultaneously. Unlike
present sequencing approaches, the progressive nature of the method (that is,
determination of a sequence nucleotide-by-nucleotide) permits one to monitor the
progress of the sequencing operation in real time which, in turn, permits the operation
to be curtailed, or re-started, if difficulties arise, thereby leading to significant savings in
time and reagent usage. Also unlike current approaches, the method permits the
simultaneous determination of allelic forms of a target polynucleotide: As described
more fully below, if a population of target polynucleotides consists of several
subpopulations of distinct sequences, e.g. polynucleotides from a heterozygous genetic
locus, then the method can identify the proportion of each nucleotide at each position in
the sequence.
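
As a rough illustration of how such proportions might be read out, the sketch below normalizes the signal measured for each of the four labels in one cycle. It assumes that signal is proportional to the number of target molecules carrying each nucleotide, which the text does not state; the function name and the example counts are hypothetical.

```python
def nucleotide_proportions(signals):
    """Convert per-label signal in one cycle to fractions summing to 1."""
    total = sum(signals.values())
    if total == 0:
        raise ValueError("no signal detected in this cycle")
    return {base: value / total for base, value in signals.items()}


# Hypothetical cycle at a heterozygous position: half A, half G
cycle = {"A": 50, "C": 0, "G": 50, "T": 0}
print(nucleotide_proportions(cycle))
```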

Generally, the method of the invention is applicable to all tasks where DNA
sequencing is employed, including medical diagnostics, genetic mapping, genetic
identification, forensic analysis, molecular biology research, and the like.



Brief Description of the Drawings

Figure 1a illustrates a preferred structure of a labeled probe of the invention.

Figure 1b illustrates a probe and terminus of a target polynucleotide wherein a
separate labeling step is employed to identify one or more nucleotides in the protruding
strand of a target polynucleotide.

Figure 1c illustrates steps of an embodiment wherein a nucleotide of the target
polynucleotide is identified by extension with a polymerase in the presence of labeled
dideoxynucleoside triphosphates followed by their excision, strand extension, and strand
displacement.

Figure 1d diagrammatically illustrates an embodiment in which nucleotide
identification is carried out by polymerase extension of a probe strand in the presence of
labeled chain-terminating nucleoside triphosphates.

Figure 1e diagrammatically illustrates an embodiment in which nucleotide
identification is carried out by polymerase extension in the presence of unlabeled
chain-terminating 3'-aminonucleoside triphosphates followed by ligation of a labeled probe.

Figure 1f illustrates probe assembly at the end of a target polynucleotide having
a 5' protruding strand.

Figure 1g illustrates probe assembly at the end of a target polynucleotide having
a 3' protruding strand.

Figure 2 illustrates the relative positions of the nuclease recognition site, ligation
site, and cleavage site in a ligated complex.

Figures 3a through 3h diagrammatically illustrate the embodiment referred to
herein as "double stepping," or the simultaneous use of two different nucleases in
accordance with the invention.

Figures 4a through 4d illustrate data showing the fidelity of nucleotide
identification through ligation with a ligase.

Figures 5a through 5c illustrate data showing nucleotide identification through
polymerase extension.

Definitions

As used herein "sequence determination" or "determining a nucleotide sequence"
in reference to polynucleotides includes determination of partial as well as full sequence
information of the polynucleotide. That is, the term includes sequence comparisons,
fingerprinting, and like levels of information about a target polynucleotide, as well as the
express identification and ordering of nucleosides, usually each nucleoside, in a target
polynucleotide.

"Perfectly matched duplex" in reference to the protruding strands of probes and
target polynucleotides means that the protruding strand from one forms a double
stranded structure with the other such that each nucleotide in the double stranded
structure undergoes Watson-Crick base pairing with a nucleotide on the opposite
strand. The term also comprehends the pairing of nucleoside analogs, such as
deoxyinosine, nucleosides with 2-aminopurine bases, and the like, that may be employed
to reduce the degeneracy of the probes.
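
The definition of a perfectly matched duplex can be expressed as a simple check on two protruding strands. The sketch below covers only the four natural bases; the nucleoside analogs mentioned in the definition are not modeled, and the function name is hypothetical.

```python
WATSON_CRICK = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G")}


def is_perfectly_matched(strand_a, strand_b):
    """True if every position forms a Watson-Crick pair.

    Both strands are written 5'->3', so strand_b is read in reverse to
    line up with strand_a in the antiparallel duplex.
    """
    if len(strand_a) != len(strand_b):
        return False
    return all((a, b) in WATSON_CRICK
               for a, b in zip(strand_a, reversed(strand_b)))


print(is_perfectly_matched("ATGC", "GCAT"))  # matched pair of overhangs
print(is_perfectly_matched("ATGC", "GCAA"))  # one mismatched position
```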

The term "oligonucleotide" as used herein includes linear oligomers of
nucleosides or analogs thereof, including deoxyribonucleosides, ribonucleosides, and the
like. Usually oligonucleotides range in size from a few monomeric units, e.g. 3-4, to
several hundreds of monomeric units. Whenever an oligonucleotide is represented by a
sequence of letters, such as "ATGCCTG," it will be understood that the nucleotides are
in 5'->3' order from left to right and that "A" denotes deoxyadenosine, "C" denotes
deoxycytidine, "G" denotes deoxyguanosine, and "T" denotes thymidine, unless
otherwise noted.

As used herein, "nucleoside" includes the natural nucleosides, including 2'-deoxy
and 2'-hydroxyl forms, e.g. as described in Kornberg and Baker, DNA Replication, 2nd
Ed. (Freeman, San Francisco, 1992). "Analogs" in reference to nucleosides includes
synthetic nucleosides having modified base moieties and/or modified sugar moieties, e.g.
described generally by Scheit, Nucleotide Analogs (John Wiley, New York, 1980).
Such analogs include synthetic nucleosides designed to enhance binding properties,
reduce degeneracy, increase specificity, and the like.

Detailed Description of the Invention

The invention provides a method of sequencing nucleic acids which obviates
electrophoretic separation of similarly sized DNA fragments and which eliminates the
difficulties associated with the detection and analysis of spatially overlapping bands of
DNA fragments in a gel or like medium. Moreover, the invention obviates the need to
generate DNA fragments from long single stranded templates with a DNA polymerase.

As mentioned above, the probes ligated to the target polynucleotide are an
important feature of the invention. Generally, the probes of the invention provide a
"platform" from which a nuclease cleaves the target polynucleotide to which the probe is
ligated. Probes of the invention can also provide a means for identifying or labeling a
nucleotide at the end of the target polynucleotide. Probes do not necessarily provide
both functions in every embodiment.

In one aspect of the invention, probes have the form illustrated in Figure 1a. In
this embodiment probes are double stranded segments of DNA having a protruding
strand at one end 10, at least one nuclease recognition site 12, and a spacer region 14
between the recognition site and the protruding end 10. Preferably, probes also include
a label 16, which in this particular embodiment is illustrated at the end opposite of the
protruding strand. The probes may be labeled by a variety of means and at a variety of
locations, the only restriction being that the labeling means selected does not interfere
with the ligation step or with the recognition of the probe by the nuclease.

In the above embodiment, whenever a nuclease leaves a 5' phosphate on the
terminus of the target polynucleotide, it is sometimes desirable to remove it, e.g. by
treatment with a standard phosphatase, prior to ligation. This prevents undesired
ligation of one of the strands when the protruding strands of the probe and target
sequence fail to form a perfectly matched duplex. This is particularly problematic when a
mismatch occurs precisely at the nucleotide position where identification is sought.
Where such phosphatase treatment is employed, the "nick" remaining in the ligated
complex after the initial ligation can be repaired by kinase treatment followed by a
second ligation step.

Preferably, embodiments of the invention employing the above type of probe
comprise the following steps: (a) ligating a probe to an end of the polynucleotide
having a protruding strand to form a ligated complex, the probe having a
complementary protruding strand to that of the polynucleotide and the probe having a
nuclease recognition site; (b) identifying one or more nucleotides in the protruding
strand of the polynucleotide, e.g. by the identity of the ligated probe; (c) cleaving the
ligated complex with a nuclease; and (d) repeating steps (a) through (c) until the
nucleotide sequence of the polynucleotide is determined. The step of identifying can
take place either before or after the step of cleaving. Preferably, the one or more
nucleotides in the protruding strand of the polynucleotide are identified prior to
cleavage. In further preference, the method also includes a step of removing unligated
probe from the ligated complex.

It is not critical whether protruding strand 10 of the probe is a 5' or 3' end.
However, in this embodiment, it is important that the protruding strands of the target
polynucleotide and probes be capable of forming perfectly matched duplexes to allow
for specific ligation. If the protruding strands of the target polynucleotide and probe are
different lengths, the resulting gap can be filled in by a polymerase prior to ligation, e.g.
as in "gap LCR" disclosed in Backman et al, European patent application 91100959.5.
Such gap filling can be used as a means for identifying one or more nucleotides in the
protruding strand of the target polynucleotide. Preferably, the number of nucleotides in
the respective protruding strands are the same so that both strands of the probe and
target polynucleotide are capable of being ligated without a filling step. Preferably, the
protruding strand of the probe is from 2 to 6 nucleotides long. As indicated below, the
greater the length of the protruding strand, the greater the complexity of the probe
mixture that is applied to the target polynucleotide during each ligation and cleavage
cycle.
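
The growth in complexity is easy to quantify if one assumes a fully degenerate mixture over the four natural nucleotides, with one probe per possible protruding strand. That assumption and the function name below are illustrative only.

```python
from itertools import product


def degenerate_overhangs(length):
    """All possible protruding-strand sequences of the given length."""
    return ["".join(p) for p in product("ACGT", repeat=length)]


for n in range(2, 7):
    # 4**n distinct probes are needed for an n-nucleotide protruding strand
    print(n, len(degenerate_overhangs(n)))
```

Under this assumption the mixture grows from 16 probes at two nucleotides to 4096 at six.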

In another aspect of the invention, the primary function of the probe is to
provide a site for a nuclease to bind to the ligated complex so that the complex can be
cleaved and the target polynucleotide shortened. In this aspect of the invention,
identification of the nucleotides can take place separately from probe ligation and
cleavage. This embodiment provides several advantages: First, sequence determination
does not require that the protruding strand of the ligated probe be perfectly
complementary to the protruding strand of the target polynucleotide, thereby permitting
greater flexibility in the control of hybridization stringency. Second, one need not
provide a fully degenerate set of probes based on the four natural nucleotides. So-called
"wild card" nucleotides, or "degeneracy reducing analogs" can be provided to
significantly reduce, or even eliminate, the complexity of the probe mixture employed in
the ligation step, since specific binding is not critical to nucleotide identification in this
embodiment. Third, if identification is not carried out via a labeling means on the probe,
then probes designed for blunt end ligation may be employed with no need for using
degenerate mixtures.

Preferably, this embodiment of the invention comprises the following steps: (a)
providing a polynucleotide having a protruding strand; (b) identifying one or more
nucleotides in the protruding strand by extending a 3' end of a strand with a nucleic acid
polymerase; (c) ligating a probe to an end of the polynucleotide to form a ligated
complex; (d) cleaving the ligated complex with a nuclease; and (e) repeating steps (a)
through (d) until the nucleotide sequence of the polynucleotide is determined.
Preferably, the target polynucleotide has a 3' recessed strand which is extended by the
nucleic acid polymerase in the presence of chain-terminating nucleoside triphosphates,
and the nuclease used produces a 3' recessed strand and 5' protruding strand at the
terminus of the target polynucleotide.

An example of this embodiment is illustrated in Figure 1b: The 3' recessed
strand of polynucleotide (15) is extended with a nucleic acid polymerase in the presence
of the four dideoxynucleoside triphosphates, each carrying a distinguishable fluorescent
label, so that the 3' recessed strand is extended by one nucleotide (11), which permits its
complementary nucleotide in the 5' protruding strand of polynucleotide (15) to be
identified. Probe (9) having recognition site (12), spacer region (14), and
complementary protruding strand (10), is then ligated to polynucleotide (15) to form
ligated complex (17). Ligated complex (17) is then cleaved at cleavage site (19) to
release a labeled fragment (21) and augmented probe (23). A shortened polynucleotide
(15) with a regenerated 3' recessed strand is then ready for the next cycle of
identification, ligation, and cleavage.

In such embodiments, the first nucleotide of the 5' protruding strand adjacent to
the double stranded portion of the target polynucleotide is readily identified by
extending the 3' strand with a nucleic acid polymerase in the presence of chain-
terminating nucleoside triphosphates. Preferably, the 3' strand is extended by a nucleic
acid polymerase in the presence of the four chain-terminating nucleoside triphosphates,
each being labeled with a distinguishable fluorescent dye so that the added nucleotide is
readily identified by the color of the attached dye. Such chain-terminating nucleoside
triphosphates are available commercially, e.g. labeled dideoxynucleoside triphosphates,
such as described by Hobbs, Jr. et al, U.S. patent 5,047,519; Cruickshank, U.S. patent
5,091,519; and the like. Procedures for such extension reactions are described in
various publications, including Syvanen et al, Genomics, 8: 684-692 (1990); Goelet et
al, International Application No. PCT/US92/01905; Livak and Brenner, U.S. patent
5,102,785; and the like.

A probe may be ligated to the target polynucleotide using conventional
procedures, as described more fully below. Preferably, the probe is ligated after a single
nucleotide extension of the 3' strand of the target polynucleotide. More preferably, the
number of nucleotides in the protruding strand of the probe is the same as the number of
nucleotides in the protruding strand of the target polynucleotide after the extension step.
That is, if the nuclease provides a protruding strand having four nucleotides, then after
the extension step the protruding strand will have three nucleotides and the protruding
strand of the preferred probe will have three nucleotides.
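
The identification by single nucleotide extension and the resulting strand lengths can be sketched as follows. The code assumes a 5' protruding strand written 5'->3', so that the nucleotide next to the double stranded portion is its last base, and it assumes one dye per chain terminator. The dye names and the function name are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
# Hypothetical dye assignment for the four labeled chain terminators
DYE = {"A": "green", "C": "blue", "G": "yellow", "T": "red"}


def extend_and_identify(protruding):
    """Model a single-nucleotide extension of the recessed 3' strand.

    Returns the dye observed, the template nucleotide it identifies,
    and the length of the protruding strand left for the probe.
    """
    template_base = protruding[-1]           # adjacent to the duplex
    terminator = COMPLEMENT[template_base]   # chain terminator incorporated
    observed = DYE[terminator]
    # Decode the dye back to the terminator, then to the template base
    decoded = {dye: base for base, dye in DYE.items()}[observed]
    identified = COMPLEMENT[decoded]
    return observed, identified, len(protruding) - 1


print(extend_and_identify("ACGT"))
```

A four-nucleotide protruding strand thus leaves three nucleotides for the probe, matching the example in the text.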

The cleavage step in this embodiment may be accomplished by a variety of
techniques, depending on the effect that the added chain-terminating nucleotide has on
the efficiencies of the nuclease and/or ligase employed. Preferably, a ligated complex is
formed with the presence of the labeled chain-terminating nucleotide, which is
subsequently cleaved with the appropriate nuclease, e.g. a class IIs restriction
endonuclease, such as Fok I, or the like.

In a preferred embodiment, after extension and ligation, the chain-terminating
nucleotide may be excised. Preferably, this is carried out by the 3'->5' exonuclease
activity (i.e. proof-reading activity) of a DNA polymerase, e.g. T4 DNA polymerase,
acting in the presence of the appropriate nucleoside triphosphates. By the action of this
enzyme, the chain-terminating nucleoside (11) is exchanged with a natural counterpart
and the strand is extended, displacing the unligated probe strand (25). Conveniently,
when probes having protruding strands are employed, this step simultaneously caps the
target polynucleotides that failed to ligate to a probe in a preceding ligation step by
"filling in" their ends, thereby preventing subsequent ligation.

Such excision may also be carried out chemically, provided that the labeled
chain-terminating nucleoside is attached by a labile bond, such as an acid-labile
phosphoramidate bond. Synthesis of such nucleoside phosphoramidates and their use
with DNA polymerases are described in Letsinger et al, J. Am. Chem. Soc., 94: 292-293
(1972) and Letsinger et al, Biochem., 15: 2810-2816 (1976). After identification, the
phosphoramidate bond is cleaved and the nucleoside excised by mild acid to leave a
terminal phosphate group which must be removed with a 3' phosphatase prior to the
next cycle.

In another embodiment, the chain-terminating nucleotide is excised and the
recessed 3' strand extended before ligation, leaving a blunt-ended target polynucleotide.
A subsequent cycle is then initiated by ligation of a blunt-ended probe to the end of the
target polynucleotide. The use of a probe with a blunt end eliminates the need to
employ multiple probes, because there are no protruding strands that have to be
hybridized in order for ligation to take place.

In another variation of this embodiment, a nuclease is selected which leaves a
one nucleotide 5' protruding strand after digestion, e.g. Alw I. Thus, chain extension
need not be carried out in the presence of chain-terminating nucleoside triphosphates;
ordinary deoxynucleoside triphosphates can be employed to leave a flush-ended
polynucleotide. A blunt-ended probe is then used to initiate the next cycle. Preferably,
the nucleoside triphosphates used are labeled, as would be the chain-terminating analogs
described in the above embodiments. In further preference, the label is attached by way
of a selectively cleavable bond, so that the label can be removed to enhance the
efficiency of the nuclease in the subsequent cycle. Several such cleavable linkage
moieties are available, e.g. Herman et al, Anal. Biochem., 156: 48-55 (1986) (disulfide
linker); Urdea, U.S. patents 4,775,619 and 5,118,605.

In yet another aspect of this embodiment, after ligation, a 3' end of a strand of
the probe is extended with a DNA polymerase in the presence of labeled chain-
terminating nucleoside triphosphates, as illustrated in Figure 1d. There, target
polynucleotide (15) having a 3' protruding end is ligated to probe (130) having a
complementary 3' protruding end (134) one nucleotide less in length. That is, when the
3' protruding strand (134) of probe (130) has three nucleotides, the 3' protruding strand
of target polynucleotide (15) would have at least four nucleotides. Ligation results in
the formation of ligated complex (17) with gap (132). Gap (132) is then filled by
extending 3' protruding end (134) with a nucleic acid polymerase in the presence of
chain-terminating nucleoside triphosphates. After cleavage, the cycle can be repeated.

This embodiment may also be implemented with unlabeled chain-terminating
nucleoside triphosphates, as illustrated in Figure 1e. Target polynucleotide (15) is
successively exposed to different 3'-aminonucleoside triphosphates in the presence of a
nucleic acid polymerase (150). The 3'-aminonucleoside triphosphates act as chain
terminators when incorporated. For example, 3'-aminoadenosine triphosphate (152)
shown incorporated in Figure 1e stops further strand extension and reduces the length
of the protruding strand by one nucleotide, from 4 to 3. After such exposure, probe
(154) with label (155) corresponding to the adenosine chain-terminator is mixed with
the target sequence for ligation (156). As the labeled probe has a protruding strand of 3
nucleotides, it will only ligate if there has been an extension. If no ligation takes place,
and no probe remains attached after washing, then the next 3'-aminonucleoside
triphosphate and corresponding probe are tried. This process continues until the target
polynucleotide is successfully extended and a corresponding probe is ligated to form
ligated complex (17). The synthesis of 3'-aminonucleoside triphosphates is described in
Kutateladze et al, FEBS Letters, 153: 420-426 (1983); Krayevsky et al, Biochimica et
Biophysica Acta, 783: 216-220 (1984); and Herrlein et al, Helvetica Chimica Acta, 77:
586-598 (1994). The ligation properties of oligonucleotides having terminal
3'-aminonucleosides are described in Fung and Gryaznov, International application
PCT/US94/03087. The chain terminating properties of 3'-aminonucleotides are
described in Herrlein et al (cited above).

In yet another embodiment of the invention, a probe is assembled at the end of a
target polynucleotide in two steps, as illustrated by the example in Figure 1f. A first
single stranded oligonucleotide (100) having a 5' monophosphate is annealed to and
ligated with target polynucleotide (15) having a 5' monophosphate on its protruding
strand to form a precursor (104) to ligated complex (17). A second single stranded
oligonucleotide (102) complementary to the protruding strand of precursor (104) is
annealed to and ligated with precursor (104) to form ligated complex (17). As with the
double stranded probes described more fully below, first oligonucleotide (100) may be
delivered to the target polynucleotide as a mixture and ligation preferably takes place at
high stringency so that perfectly matched hybrids (between the protruding strand of the
target polynucleotide and the 5' end of the first oligonucleotide) are preferentially
ligated. Clearly, second oligonucleotide (102) need only have a sequence
complementary to the protruding portion of precursor (104) so that a second ligation
can take place to form ligated complex (17).

In another form of this embodiment, illustrated in Figure 1g, a first single
stranded oligonucleotide (120) is annealed to and ligated with target polynucleotide (15)
having a 5' monophosphate on its recessed strand to form a precursor (124) to ligated
complex (17). A second single stranded oligonucleotide (122) complementary to the
protruding strand of precursor (124) and having a 5' monophosphate is annealed to and
ligated with precursor (124) to form ligated complex (17). As with the double stranded
probes described more fully below, first oligonucleotide (120) may be delivered to the
target polynucleotide as a mixture and ligation preferably takes place at high stringency
so that perfectly matched hybrids (between the protruding strand of the target
polynucleotide and the 3' end of the first oligonucleotide) are preferentially ligated. As
above, second oligonucleotide (122) need only have a sequence complementary to the
protruding portion of precursor (124) so that a second ligation can take place to form
ligated complex (17).

The complementary strands of the probes are conveniently synthesized on an
automated DNA synthesizer, e.g. an Applied Biosystems, Inc. (Foster City, California)
model 392 or 394 DNA/RNA Synthesizer, using standard chemistries, such as
phosphoramidite chemistry, e.g. disclosed in the following references: Beaucage and
Iyer, Tetrahedron, 48: 2223-2311 (1992); Molko et al, U.S. patent 4,980,460; Koster et
al, U.S. patent 4,725,677; Caruthers et al, U.S. patents 4,415,732; 4,458,066; and
4,973,679; and the like. Alternative chemistries, e.g. resulting in non-natural backbone
groups, such as phosphorothioate, phosphoramidate, and the like, may also be employed
provided that the resulting oligonucleotides are compatible with the ligation and
cleavage reagents. After synthesis, the complementary strands are combined to form a
double stranded probe. Generally, the protruding strand of a probe is synthesized as a
mixture, so that every possible sequence is represented in the protruding portion. For
example, if the protruding portion consisted of four nucleotides, in one embodiment
four mixtures are prepared as follows:

X1X2 ... XiNNNA,
X1X2 ... XiNNNC,
X1X2 ... XiNNNG, and
X1X2 ... XiNNNT

where the "NNNs" represent every possible 3-mer and the "Xs" represent the duplex
forming portion of the strand. Thus, each of the four probes listed above contains 4^3 or
64 distinct sequences; or, in other words, each of the four probes has a degeneracy of
64. For example, X1X2 ... XiNNNA contains the following sequences:

X1X2 ... XiAAAA
X1X2 ... XiAACA
X1X2 ... XiAAGA
X1X2 ... XiAATA
X1X2 ... XiACAA
      .
      .
      .
X1X2 ... XiTGTA
X1X2 ... XiTTAA
X1X2 ... XiTTCA
X1X2 ... XiTTGA
X1X2 ... XiTTTA

Such mixtures are readily synthesized using well known techniques, e.g. as disclosed in
Telenius et al, Genomics, 13: 718-725 (1992); Welsh et al, Nucleic Acids Research, 19:
5275-5279 (1991); Grothues et al, Nucleic Acids Research, 21: 1321-1322 (1993);
Hartley, European patent application 90304496.4; and the like. Generally, these
techniques simply call for the application of mixtures of the activated monomers to the
growing oligonucleotide during the coupling steps where one desires to introduce the
degeneracy. As discussed above, in some embodiments it may be desirable to reduce
the degeneracy of the probes. This can be accomplished using degeneracy reducing
analogs, such as deoxyinosine, 2-aminopurine, or the like, e.g. as taught in Kong Thoo
Lin et al, Nucleic Acids Research, 20: 5149-5152, [or by] U.S. patent 5,002,867;
Nichols et al, Nature, 369: 492-493 (1994); and the like.

Preferably, for oligonucleotides with phosphodiester linkages, the duplex
forming region of a probe is between about 12 and about 30 basepairs in length; more
preferably, its length is between about 15 and about 25 basepairs.

From the above, it is clear that the probes can have a wide variety of forms. For
example, the probes can have the form X1X2 ... XiANNN, X1X2 ... XiNANN, X1X2
... XiNNAN, or the like. Or, the number of probe sets could be increased and the
degeneracy reduced by constructing 16 sets of probes of 16-fold degeneracy having the
form: X1X2 ... XiNNAA, X1X2 ... XiNNAC, X1X2 ... XiNNAG, and so on.
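
The degeneracy arithmetic above can be checked with a short sketch. The code below is illustrative only and is not part of the disclosure; the function names `expand_probe` and `probe_sets` are hypothetical, and only the protruding portion of each probe is modeled (the duplex forming "X" region is omitted).

```python
from itertools import product

BASES = "ACGT"

def expand_probe(pattern: str) -> list[str]:
    """Expand a protruding-strand pattern such as 'NNNA' into every
    concrete sequence it represents ('N' stands for any of the four bases)."""
    choices = [BASES if ch == "N" else ch for ch in pattern]
    return ["".join(p) for p in product(*choices)]

def probe_sets(n_degenerate: int, n_fixed: int) -> dict[str, list[str]]:
    """Build one mixture per fixed suffix: n_degenerate 'N' positions
    followed by each possible fixed suffix of length n_fixed."""
    sets = {}
    for suffix in product(BASES, repeat=n_fixed):
        pattern = "N" * n_degenerate + "".join(suffix)
        sets[pattern] = expand_probe(pattern)
    return sets

four_sets = probe_sets(3, 1)      # NNNA, NNNC, NNNG, NNNT
sixteen_sets = probe_sets(2, 2)   # NNAA, NNAC, NNAG, ...

for name, sets in (("4 sets", four_sets), ("16 sets", sixteen_sets)):
    degeneracy = len(next(iter(sets.values())))
    print(f"{name}: {len(sets)} mixtures, degeneracy {degeneracy} each")
```

Under these assumptions, four mixtures each carry 64 sequences and sixteen mixtures each carry 16, so the total of 256 four-nucleotide protruding strands is covered in both designs.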

It is not crucial that the duplex forming region of each such set of probes have
the same length. Size differences among the probes can be used as a means for
identifying them, e.g. Skolnick et al, Genomics, 2: 273-279 (1988). Also, in some
embodiments, it may be desirable to synthesize the probe as a single polynucleotide
which contains self-complementary regions. After synthesis, the self-complementary
regions are allowed to anneal to form a probe with a protruding strand at one end and a
single stranded loop at the other end. Preferably, in such embodiments the loop region
may comprise from about 3 to 10 nucleotides, or other comparable linking moieties, e.g.
alkylether groups, such as disclosed in U.S. patent 4,914,210. Many techniques are
available for attaching reactive groups to the bases or internucleoside linkages for
labeling, as discussed below.

When conventional ligases are employed in the invention, as described more
fully below, the 5' end of the probe may be phosphorylated in some embodiments. A 5'
monophosphate can be attached to a second oligonucleotide either chemically or
enzymatically with a kinase, e.g. Sambrook et al, Molecular Cloning: A Laboratory
Manual, 2nd Edition (Cold Spring Harbor Laboratory, New York, 1989). Chemical
phosphorylation is described by Horn and Urdea, Tetrahedron Lett., 27: 4705 (1986),
and reagents for carrying out the disclosed protocols are commercially available, e.g. 5'
Phosphate-ON™ from Clontech Laboratories (Palo Alto, California). Thus, in some
embodiments, probes may have the form:

5'-X1X2 ... XiTTGA
   Y1Y2 ... Yip

the form:

5'-pAGTTX1X2 ... Xi
        Y1Y2 ... Yi

or the like, where the Y's are the complementary nucleotides of the X's and "p" is a
monophosphate group.

The probes of the invention can be labeled in a variety of ways, including the
direct or indirect attachment of radioactive moieties, fluorescent moieties, colorimetric
moieties, and the like. Many comprehensive reviews of methodologies for labeling
DNA and constructing DNA probes provide guidance applicable to constructing probes
of the present invention. Such reviews include Matthews et al, Anal. Biochem., Vol.
169, pgs. 1-25 (1988); Haugland, Handbook of Fluorescent Probes and Research
Chemicals (Molecular Probes, Inc., Eugene, 1992); Keller and Manak, DNA Probes,
2nd Edition (Stockton Press, New York, 1993); Eckstein, editor, Oligonucleotides
and Analogues: A Practical Approach (IRL Press, Oxford, 1991); Wetmur, Critical
Reviews in Biochemistry and Molecular Biology, 26: 227-259 (1991); and the like.
Many more particular methodologies applicable to the invention are disclosed in the
following sample of references: Connolly, Nucleic Acids Research, Vol. 15, pgs. 3131-
3139 (1987); Gibson et al, Nucleic Acids Research, Vol. 15, pgs. 6455-6467 (1987);
Sproat et al, Nucleic Acids Research, Vol. 15, pgs. 4837-4848 (1987); Fung et al, U.S.
patent 4,757,141; Hobbs, Jr., et al, U.S. patent 5,151,507; Cruickshank, U.S. patent
5,091,519 (synthesis of functionalized oligonucleotides for attachment of reporter
groups); Jablonski et al, Nucleic Acids Research, 14: 6115-6128 (1986) (enzyme-
oligonucleotide conjugates); and Urdea et al, U.S. patent 5,124,246 (branched DNA).
Attachment sites of labeling moieties are not critical in embodiments relying on probe
labels to identify nucleotides in the target polynucleotide, provided that such labels do not
interfere with the ligation and cleavage steps. In particular, dyes may be conveniently
attached to the end of the probe distal to the target polynucleotide on either the 3' or 5'
termini of strands making up the probe, e.g. Eckstein (cited above), Fung (cited above),
and the like. In some embodiments, attaching labeling moieties to interior bases or
inter-nucleoside linkages may be preferred.

Preferably, the probes are labeled with one or more fluorescent dyes, e.g. as
disclosed by Menchen et al, U.S. patent 5,188,934; Begot et al, PCT application
PCT/US90/05565.

In accordance with the invention, a probe of the invention is ligated to an end of
a target polynucleotide to form a ligated complex in each cycle of ligation and cleavage.
The ligated complex is the double stranded structure formed after probe and target are
ligated, usually after the protruding strands of the target polynucleotide and probe
anneal and at least one pair of the identically oriented strands are caused to be
covalently linked to one another. Ligation can be accomplished either enzymatically or
chemically. Chemical ligation methods are well known in the art, e.g. Ferris et al,
Nucleosides & Nucleotides, 8: 407-414 (1989); Shabarova et al, Nucleic Acids
Research, 19: 4247-4251 (1991); and the like. Preferably, however, ligation is carried
out enzymatically using a ligase in a standard protocol. Many ligases are known and are
suitable for use in the invention, e.g. Lehman, Science, 186: 790-797 (1974); Engler et
al, DNA Ligases, pages 3-30 in Boyer, editor, The Enzymes, Vol. 15B (Academic
Press, New York, 1982); and the like. Preferred ligases include T4 DNA ligase, T7
DNA ligase, E. coli DNA ligase, Taq ligase, Pfu ligase, and Tth ligase. Protocols for
their use are well known, e.g. Sambrook et al (cited above); Barany, PCR Methods and
Applications, 1: 5-16 (1991); Marsh et al, Strategies, 5: 73-76 (1992); and the like.
Generally, ligases require that a 5' phosphate group be present for ligation to the 3'
hydroxyl of an abutting strand. This is conveniently provided for at least one strand of
the target polynucleotide by selecting a nuclease which leaves a 5' phosphate, e.g.
Fok I.

In a preferred embodiment of the invention employing unphosphorylated probes,
the step of ligating includes (i) ligating the probe to the target polynucleotide with a
ligase so that a ligated complex is formed having a nick on one strand, (ii)
phosphorylating the 5' hydroxyl at the nick with a kinase using conventional protocols,
e.g. Sambrook et al (cited above), and (iii) ligating again, to covalently join the strands at
the nick, i.e. to remove the nick.

Preferably, a target polynucleotide for use in the invention is double stranded
and is prepared so that it has a protruding strand at at least one end. The protruding strand
may be either 5' or 3' and, preferably, the number of nucleotides in the protruding
portion of the strand is in the range of from 2 to 6. A target polynucleotide is referred
to as "-k" where k is some integer, e.g. usually between 2 and 6, whenever the 5' strand
is protruding. Conversely, a target polynucleotide is referred to as "+k" whenever the 3'
strand is protruding. For example, the following would be a -4 target polynucleotide in
accordance with this nomenclature:

5'-AACGTTTAC ...
       AAATG ...
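
The "-k"/"+k" nomenclature can be expressed as a small function. This is a sketch under stated assumptions rather than anything taken from the disclosure: the function name `overhang_label` is hypothetical, the upper strand is written 5' to 3', the lower strand is written 3' to 5', and the two strands are assumed to be aligned at their right-hand ends so that any protruding strand is at the left.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def overhang_label(top_5to3: str, bottom_3to5: str) -> str:
    """Return '-k' when the 5' strand protrudes by k nucleotides,
    '+k' when the 3' strand protrudes, or 'blunt' when neither does.

    Assumption: both strands are right-aligned, so the left end of the
    top strand is a 5' end and the left end of the bottom strand is a 3' end.
    """
    paired = min(len(top_5to3), len(bottom_3to5))
    if paired == 0:
        raise ValueError("both strands must contain at least one nucleotide")
    # Check that the duplex portion is Watson-Crick complementary.
    for t, b in zip(top_5to3[-paired:], bottom_3to5[-paired:]):
        if COMPLEMENT[t] != b:
            raise ValueError("strands are not complementary in the duplex region")
    k = len(top_5to3) - len(bottom_3to5)
    if k > 0:
        return f"-{k}"   # top strand's 5' end protrudes
    if k < 0:
        return f"+{-k}"  # bottom strand's 3' end protrudes
    return "blunt"

print(overhang_label("AACGTTTAC", "AAATG"))  # the example shown above
```

Applied to the example above, the upper strand is four nucleotides longer than the lower strand at its 5' end, which corresponds to a -4 target polynucleotide.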



In one preferred embodiment of the invention, the target polynucleotide is
anchored to a solid phase support, such as a magnetic particle, polymeric microsphere,
filter material, or the like, which permits the sequential application of reagents without
complicated and time-consuming purification steps. The length of the target
polynucleotide can vary widely; however, for convenience of preparation, lengths
employed in conventional sequencing are preferred. For example, lengths in the range
of a few hundred basepairs, e.g. 200-300, to 1 to 2 kilobase pairs are preferred.

The target polynucleotides can be prepared by various conventional methods.
For example, target polynucleotides can be prepared as inserts of any of the
conventional cloning vectors, including those used in conventional DNA sequencing.
Extensive guidance for selecting and using appropriate cloning vectors is found in
Sambrook et al, Molecular Cloning: A Laboratory Manual, Second Edition (Cold
Spring Harbor Laboratory, New York, 1989), and like references. Sambrook et al and
Innis et al, editors, PCR Protocols (Academic Press, New York, 1990) also provide
guidance for using polymerase chain reactions to prepare target polynucleotides.
Preferably, cloned or PCR-amplified target polynucleotides are prepared which permit
attachment to magnetic beads, or other solid supports, for ease of separating the target
polynucleotide from other reagents used in the method. Protocols for such preparative
techniques are described fully in Wahlberg et al, Electrophoresis, 13: 547-551 (1992);
Tong et al, Anal. Chem., 64: 2672-2677 (1992); Hultman et al, Nucleic Acids Research,
17: 4937-4946 (1989); Hultman et al, Biotechniques, 10: 84-93 (1991); Syvanen et al,
Nucleic Acids Research, 16: 11327-11338 (1988); Dattagupta et al, U.S. patent
4,734,363; Uhlen, PCT application PCT/GB89/00304; and like references. Kits are also
commercially available for practicing such methods, e.g. Dynabeads™ template
preparation kit from Dynal AS (Oslo, Norway).

Populations of target polynucleotides may be prepared in parallel by the use of
microparticles, e.g. magnetic beads, controlled pore glass particles, or the like, that each
have a uniform population of adaptors attached. The adaptor is an oligonucleotide
between about 30 and 100 nucleotides in length that comprises regions for PCR primer
binding, regions that form restriction endonuclease cleavage sites when duplexes are
established, and an address region of about 12-15 nucleotides that permits capture of a
unique target polynucleotide by hybridization. Such adaptors may also comprise other
linking moieties known in the art, e.g. polyethylene glycol arms, or the like. The
population of adaptors on a particular microparticle is uniform in the sense that each
oligonucleotide has the same sequence, so that the same target polynucleotide would be
captured by different adaptors on the same microparticle. Preparation of microparticles
with uniform populations of oligonucleotides is disclosed in PCT publications WO
92/00091, WO 92/03461, and like references. For parallel sequencing, target
polynucleotides are prepared in a library whose vector contains complementary address
regions adjacent to the target polynucleotide insert. After excision and denaturing, the
population of target polynucleotides (each of which now has a complementary address
region on its terminus) is mixed with a population of microparticles under conditions
that permit capture. Individual particles with captured target polynucleotides may be
isolated and manipulated on a microscope slide, e.g. as taught by Lam et al, PCT
publication WO 92/00091 and Lam et al, Science, 354: 82-84 (1991).
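
The capture of target polynucleotides by address regions can be illustrated with a minimal sketch. Everything named here is hypothetical (the function `assign_targets`, the bead and insert names, and the 12-nucleotide address sequences), and hybridization is idealized as exact reverse-complement matching, ignoring stringency and mismatches.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]

def assign_targets(bead_addresses: dict[str, str],
                   target_tags: dict[str, str]) -> dict[str, list[str]]:
    """Assign each target to the bead whose adaptor address region is
    perfectly complementary to the tag on the target's terminus."""
    captured = {bead: [] for bead in bead_addresses}
    for target, tag in target_tags.items():
        for bead, address in bead_addresses.items():
            if tag == reverse_complement(address):
                captured[bead].append(target)
    return captured

# Hypothetical 12-nucleotide address regions, one per microparticle.
beads = {"bead_1": "ACGTTGCAAGGC", "bead_2": "TTGACCGTAAGC"}
targets = {
    "insert_A": reverse_complement("ACGTTGCAAGGC"),
    "insert_B": reverse_complement("TTGACCGTAAGC"),
    "insert_C": "AAAAAAAAAAAA",  # no matching address; not captured
}
print(assign_targets(beads, targets))
```

Because every adaptor on a given microparticle carries the same address region, each particle in this idealized model collects only one species of target polynucleotide.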

"Nuclease" as the term is used in accordance with the invention means any
enzyme, combination of enzymes, or other chemical reagents, or combinations of chemical
reagents and enzymes that when applied to a ligated complex, discussed more fully
below, cleaves the ligated complex to produce an augmented probe and a shortened
target polynucleotide. A nuclease of the invention need not be a single protein, or
consist solely of a combination of proteins. A key feature of the nuclease, or of the
combination of reagents employed as a nuclease, is that its (their) cleavage site be
separate from its (their) recognition site. The distance between the recognition site of a
nuclease and its cleavage site will be referred to herein as its "reach." By convention,
"reach" is defined by two integers which give the number of nucleotides between the
recognition site and the hydrolyzed phosphodiester bonds of each strand. For example,
the recognition and cleavage properties of Fok I are typically represented as
"GGATG(9/13)" because it recognizes and cuts a double stranded DNA as follows:
where the bolded nucleotides are Fok I's recognition site and the N's are arbitrary
nucleotides and their complements.
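
The "GGATG(9/13)" notation can be made concrete with a short sketch. The function names `fok1_cut_sites` and `protruding_strand` and the example sequence are hypothetical; the code assumes the recognition site is read on the upper (5' to 3') strand and that the two integers count nucleotides from the end of the site to the cut on the upper and lower strands, respectively.

```python
def fok1_cut_sites(top_strand: str,
                   site: str = "GGATG",
                   reach: tuple[int, int] = (9, 13)) -> tuple[int, int]:
    """Return (top_cut, bottom_cut) as 0-based indices into the top strand.

    top_cut is the index of the first top-strand nucleotide after the cut;
    bottom_cut is the same for the complementary strand, expressed in
    top-strand coordinates.
    """
    start = top_strand.find(site)
    if start == -1:
        raise ValueError("recognition site not found")
    site_end = start + len(site)
    top_cut = site_end + reach[0]
    bottom_cut = site_end + reach[1]
    if bottom_cut > len(top_strand):
        raise ValueError("sequence too short for the nuclease's reach")
    return top_cut, bottom_cut

def protruding_strand(top_strand: str) -> str:
    """Return the single-stranded nucleotides left on the downstream fragment."""
    top_cut, bottom_cut = fok1_cut_sites(top_strand)
    return top_strand[top_cut:bottom_cut]

# Hypothetical sequence: 2 nt, the site, 9 nt, 4 nt overhang, 4 nt.
example = "AC" + "GGATG" + "ACGTACGTA" + "CCAA" + "TTGG"
print(fok1_cut_sites(example), protruding_strand(example))
```

Since the two cuts are offset by 13 - 9 = 4 nucleotides, the downstream fragment in this model carries a four-nucleotide 5' protruding strand.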

It is important that the nuclease only cleave the target polynucleotide after it 
forms a complex with its recognition site; and preferably, the nuclease leaves a 
protruding strand on the target polynucleotide after cleavage. 

Cleavage with a nuclease can be accomplished using chemical nucleases, e.g. as
disclosed by Sigman et al, Ann. Rev. Biochem., 59: 207-236 (1990); Le Doan et al,
Nucleic Acids Research, 15: 7749-7760 (1987); U.S. patent 4,795,700; Francois et al,
Proc. Natl. Acad. Sci., 86: 9702-9706 (1989); and like references. Preferably, such
embodiments comprise an oligonucleotide moiety linked to a cleavage moiety, wherein
the oligonucleotide moiety recognizes the ligated complex by triple helix formation.
There is extensive guidance in the literature for selecting appropriate sequences,
orientation, conditions, nucleoside type (e.g. whether ribose or deoxyribose nucleosides
are employed), base modifications (e.g. methylated cytosine, and the like) in order to
maximize, or otherwise regulate, triplex stability as desired in particular embodiments,
e.g. Roberts et al, Proc. Natl. Acad. Sci., 88: 9397-9401 (1991); Roberts et al, Science,
258: 1463-1466 (1992); Distefano et al, Proc. Natl. Acad. Sci., 90: 1179-1183 (1993);
Mergny et al, Biochemistry, 30: 9791-9798 (1991); Cheng et al, J. Am. Chem. Soc.,
114: 4465-4474 (1992); Beal and Dervan, Nucleic Acids Research, 20: 2773-2776
(1992); Beal and Dervan, J. Am. Chem. Soc., 114: 4976-4982 (1992); Giovannangeli et
al, Proc. Natl. Acad. Sci., 89: 8631-8635 (1992); Moser and Dervan, Science, 238:
645-650 (1987); McShan et al, J. Biol. Chem., 267: 5712-5721 (1992); Yoon et al,
Proc. Natl. Acad. Sci., 89: 3840-3844 (1992); Blume et al, Nucleic Acids Research, 20:
1777-1784 (1992); and the like. Preferably, such chemical nucleases are employed with
an exonuclease which can produce a protruding strand after cleavage. Although current
chemical nucleases are limited in that their cleavage sites vary around an expected site,
they can be employed in fingerprinting, sequence comparisons, and other uses that only
require partial sequence information.

Preferably, nucleases employed in the invention are natural protein
endonucleases (i) whose recognition site is separate from its cleavage site and (ii) whose
cleavage results in a protruding strand on the target polynucleotide. Most preferably,
class IIs restriction endonucleases are employed as nucleases in the invention, e.g. as
described in Szybalski et al, Gene, 100: 13-26 (1991); Roberts et al, Nucleic Acids
Research, 21: 3125-3137 (1993); and Livak and Brenner, U.S. patent 5,093,245.
Exemplary class IIs nucleases for use with the invention include Alw XI, Bsm AI, Bbv I,
Bsm FI, Sts I, Hga I, Bsc AI, Bbv II, Bce fI, Bce 85I, Bcc I, Bcg I, Bsa I, Bsg I, Bsp
MI, Bst 71I, Ear I, Eco 57I, Esp 3I, Fau I, Fok I, Gsu I, Hph I, Mbo II, Mme I, Rle AI,
Sap I, Sfa NI, Taq II, Tth 111II, Bco 5I, Bpu AI, Fin I, Bsr DI, and isoschizomers
thereof. Preferred nucleases include Fok I, Hga I, Ear I, and Sfa NI.

Preferably, prior to nuclease cleavage steps, usually at the start of a sequencing
operation, the target polynucleotide is treated to block the recognition sites and/or
cleavage sites of the nuclease being employed. This prevents undesired cleavage of the
target polynucleotide because of the fortuitous occurrence of nuclease recognition sites
at interior locations in the target polynucleotide. Blocking can be achieved in a variety
of ways, including methylation and treatment by sequence-specific aptamers, DNA
binding proteins, or oligonucleotides that form triplexes. Whenever natural protein
endonucleases are employed, recognition sites can be conveniently blocked by
methylating the target polynucleotide with the cognate methylase of the nuclease being
used. That is, for most if not all type II bacterial restriction endonucleases, there exists a
so-called "cognate" methylase that methylates its recognition site. Many such
methylases are disclosed in Roberts et al (cited above) and Nelson et al, Nucleic Acids
Research, 21: 3139-3154 (1993), and are commercially available from a variety of
sources, particularly New England Biolabs (Beverly, MA).
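
To see why blocking is needed, one can scan a target sequence for fortuitous interior recognition sites. The sketch below is illustrative only; the function name `internal_sites` and the example sequence are hypothetical, and the Fok I site GGATG is used as the default because it is discussed above. Both orientations are searched because the site may occur on either strand.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]

def internal_sites(target: str, site: str = "GGATG") -> list[tuple[int, str]]:
    """List (position, strand) for every occurrence of the recognition site.

    '+' means the site reads 5'->3' on the given strand; '-' means it
    reads 5'->3' on the complementary strand.
    """
    motifs = [(site, "+")]
    rc = reverse_complement(site)
    if rc != site:  # avoid double counting palindromic sites
        motifs.append((rc, "-"))
    hits = []
    for motif, strand in motifs:
        pos = target.find(motif)
        while pos != -1:
            hits.append((pos, strand))
            pos = target.find(motif, pos + 1)
    return sorted(hits)

# Hypothetical target with one site on each strand.
print(internal_sites("TTGGATGAACATCCTT"))
```

Any position reported by such a scan is a location where unblocked target polynucleotide could be cleaved out of turn, which is the situation the methylation step is meant to prevent.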

In accordance with the invention, after a probe is ligated to the target
polynucleotide to form a ligated complex, the ligated complex is cleaved with a nuclease
to give an augmented probe and a shortened target polynucleotide. This occurs because
the probe is designed such that the distance from the recognition site of the probe to the
end of the probe is less than the distance from the recognition site to the cleavage site of
the nuclease. That is, the nuclease necessarily cleaves in a region of the target
polynucleotide, thereby shortening it by one or more nucleotides in each cycle, as
illustrated in Figure 2. Conversely, in each cycle the probe has one or more nucleotides
added to it after cleavage to form an augmented probe. In Figure 2, ligated complex 20
is shown with recognition site 22 of the Fok I nuclease. The terminus 24 of the probe is
one nucleotide to the left of the Fok I cleavage site 26. Thus, in the illustrated
embodiment, ligation leads to the identification of the terminal thymidine on the target
polynucleotide and cleavage results in the shortening of each strand of the target
polynucleotide by one nucleotide. The nucleotides removed by the cleavage together
with the probe to which they remain attached form an augmented probe.

As mentioned above, the method of the invention is preferably carried out in the
following steps: (a) ligating a probe to an end of the polynucleotide having a protruding
strand to form a ligated complex, the probe having a complementary protruding strand
to that of the polynucleotide and the probe having a nuclease recognition site; (b)
removing unligated probe from the ligated complex; (c) identifying one or more
nucleotides in the protruding strand of the polynucleotide; (d) cleaving the ligated
complex with a nuclease; and (e) repeating steps (a) through (d) until the nucleotide
sequence of the polynucleotide is determined. Identification of the one or more
nucleotides in the protruding strand of the target polynucleotide is carried out either
before or after the cleavage step, depending on the embodiment of the invention being
implemented. Detection prior to
cleavage is preferred in embodiments where sequencing is carried out in parallel on a
plurality of sequences (either segments of a single target polynucleotide or a plurality of
altogether different target polynucleotides), e.g. attached to separate magnetic beads, or
other types of solid phase supports. Detection either before or after cleavage may be
carried out in embodiments where a homogeneous population of target polynucleotides
is being analyzed, e.g. a population of solid phase supports, such as magnetic beads, all
have the identical target polynucleotide attached. In such cases, other factors may
dictate the ordering of the detection and cleavage steps, such as the detection scheme
being employed, whether the sequencing reactions are being carried out in separate
reaction mixtures or whether they take place in a common mixture, and the like.
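
Steps (a) through (e) can be traced with a deliberately simplified simulation. This is a sketch under several assumptions that are not taken from the disclosure: the function name `sequence_by_ligation_and_cleavage` is hypothetical, the target is reduced to a single strand read from its protruding end, four labeled probe mixtures are used, the interrogated nucleotide is taken to be the first nucleotide of a four-nucleotide protruding strand, and ligation and cleavage are treated as perfectly efficient with one nucleotide removed per cycle.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def sequence_by_ligation_and_cleavage(target: str,
                                      overhang_len: int = 4,
                                      step: int = 1) -> str:
    """Simulate repeated ligation, identification, and cleavage.

    Each cycle reads one nucleotide of the current protruding strand and
    then shortens the target by `step` nucleotides. The loop ends when too
    little target remains to present a full protruding strand.
    """
    remaining = target
    called = []
    while len(remaining) >= overhang_len:
        overhang = remaining[:overhang_len]
        # (a), (b): of the four labeled mixtures, only the one whose defined
        # base pairs with the interrogated nucleotide stays ligated after washing.
        ligated_label = next(
            label for label in "ACGT" if COMPLEMENT[label] == overhang[0]
        )
        # (c): the label on the ligated probe identifies the target nucleotide.
        called.append(COMPLEMENT[ligated_label])
        # (d): cleavage shortens the target and exposes a new protruding strand.
        remaining = remaining[step:]
    # (e): cycles repeat until the loop condition fails.
    return "".join(called)

print(sequence_by_ligation_and_cleavage("TACGGATTC"))
```

In this idealized model the final three nucleotides are not read, because a full four-nucleotide protruding strand can no longer be presented; real embodiments would depend on how the end of the target is anchored.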

In further preference, the method includes a capping step after the unligated
probe is washed from the target polynucleotide. In a capping step, by analogy with
polynucleotide synthesis, e.g. Andrus et al, U.S. patent 4,816,571, target
polynucleotides that have not undergone ligation to a probe are rendered inert to further
ligation steps in subsequent cycles. In this manner spurious signals from "out of phase"
cleavages are prevented. When a nuclease leaves a 5' protruding strand on the target
polynucleotides, capping is preferably accomplished by exposing the unreacted target
polynucleotides to a mixture of the four dideoxynucleoside triphosphates, or other
chain-terminating nucleoside triphosphates, and a DNA polymerase. The DNA
polymerase extends the 3' strand of the unreacted target polynucleotide by one
chain-terminating nucleotide, e.g. a dideoxynucleotide, thereby rendering it incapable of
ligating in subsequent cycles.
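
The effect of capping on "out of phase" signals can be illustrated with a toy population model. This model is an assumption made for illustration and does not come from the disclosure: the function name `phase_distribution` is hypothetical, each target is taken to ligate with a fixed probability per cycle, cleavage and capping are treated as fully efficient, and a target that fails to ligate is assumed either to be capped or to fall one cycle behind.

```python
def phase_distribution(cycles: int,
                       ligation_efficiency: float,
                       capping: bool) -> tuple[dict[int, float], float]:
    """Return (in_play, capped).

    in_play maps lag (number of cycles a target is behind) to the fraction
    of targets with that lag; capped is the fraction removed by capping.
    """
    in_play = {0: 1.0}
    capped = 0.0
    for _ in range(cycles):
        updated: dict[int, float] = {}
        for lag, fraction in in_play.items():
            ligated = fraction * ligation_efficiency
            missed = fraction - ligated
            updated[lag] = updated.get(lag, 0.0) + ligated
            if capping:
                capped += missed  # rendered inert; gives no later signal
            else:
                updated[lag + 1] = updated.get(lag + 1, 0.0) + missed
        in_play = updated
    return in_play, capped

with_cap, removed = phase_distribution(3, 0.9, capping=True)
without_cap, _ = phase_distribution(3, 0.9, capping=False)
print(with_cap, removed)
print(without_cap)
```

With capping, every target still participating is in phase, at the cost of a shrinking population; without capping, the lagging fractions persist and would contribute signal from earlier positions.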

Clearly, one of ordinary skill in the art could combine features of the 
embodiments set forth above to design still further embodiments in accordance with the 
invention, but not expressly set forth above. 

An important aspect of the invention is "multiple stepping," or the simultaneous
use of a plurality of nucleases which cleave at different distances from the ligation site to
sequence a target polynucleotide. The use of multiple nucleases having different
reaches permits one to periodically "restart" the sequencing process by capping
sequences involved in prior or current cycles of ligation and cleavage and by beginning a
new cycle of ligation and cleavage on a "fresh" set of target polynucleotides whose
protruding strands are exposed by cleavage with a long reach nuclease. By employing
multiple nucleases in this manner the number of nucleotides that can be determined on a
set of target polynucleotides can be increased over that which can be done with a single
nuclease.

In using multiple nucleases it is important that one be able to convert the
protruding strand of a target polynucleotide from one form to another. For example, one
may wish to apply both Fok I (which leaves a -4 target polynucleotide) and Ear I (which
leaves a -3 target polynucleotide) to a target sequence, i.e. "double stepping". As
described more fully below, in order to do this, one must be able to convert the -4 target
polynucleotide to a -3 target polynucleotide without loss of information. This can be
accomplished by providing a conversion probe that has the following properties: i) a
protruding strand compatible with the current target polynucleotide protruding strand,
i.e. having the same number of nucleotides in antiparallel orientation, ii) a nuclease
recognition site of the nuclease being converted to, and iii) a spacer region selected so
that the cut site of the new nuclease corresponds to at least one of the ligation sites of
the two strands. Preferably, the conversion probe permits ligation of only one strand
and one of the unligated sites, i.e. nicks, is located at the cleavage site of the nuclease
being converted to.

Figures 3a through 3h diagrammatically illustrate this aspect of the invention in
the case where two nucleases are employed, a first nuclease which permits cleavage ten
nucleotides from the ligation site and a second nuclease which permits cleavage of one
nucleotide from the ligation site. The process illustrated in Figure 3 is readily generalized
to more than two nucleases. In Figure 3a, a mixture of probes 34 and 36 is ligated to
the target polynucleotides 30 attached to solid phase support 32. Probe 34 contains a
nuclease recognition site of a first nuclease that has a long reach, e.g. ten nucleotides,
and a short spacer region so that its associated nuclease cleaves deeply into the target
polynucleotide. Probe 36 converts (if necessary) the protruding strand of the target
polynucleotides (initially prepared for the first nuclease) to a protruding strand
corresponding to a second nuclease used to cleave the target polynucleotide one
nucleotide at a time. With the appropriate protruding strand available, the second
nuclease is employed in nine cycles of ligation and cleavage followed by a capping step
to give the identity of the first nine nucleotides of the target polynucleotide. As
illustrated in Figure 3b, capped sequences 38 no longer participate in ligation and
cleavage cycles. The number of capped sequences produced in this step depends on the
mixture of the two probes employed which, in turn, depends on several factors,
including the length of the target polynucleotide, the nature of the label on the probes,
the efficiencies of ligation and cleavage of the enzymes employed, and the like. The
target polynucleotides 41 are then cleaved at 40 with the first nuclease, shown in Figure
3c, to produce appropriate protruding strands at the termini of the target
polynucleotides and the identity of the tenth nucleotide. After cleavage and washing, a
mixture of probes 34 and 36 is ligated to the non-capped target polynucleotides 42
(Figure 3d) to form ligated complexes. The ligated complexes including probe 36 are
cleaved to convert the protruding strands of their associated target polynucleotides to
protruding strands corresponding to the second nuclease, after which another nine
cycles of ligation and cleavage take place followed by a capping step, to form a second
set of capped sequences 44 (Figure 3e). In this series of cycles the identities of
nucleotides 11 through 19 are determined.

Next the target polynucleotides are cleaved with the first nuclease at 46 (in
Figure 3f) to produce protruding strands on target polynucleotides 48, after which a
mixture of probes 34 and 36 is ligated to the target polynucleotides to form ligated
complexes 50 (Figure 3g). The ligated complexes comprising probe 36 are again
cleaved to convert the protruding strands of their associated target polynucleotides to
ones corresponding to the second nuclease, after which nine cycles of ligation and
cleavage take place followed by a capping step, to form a third set of capped sequences
52 (Figure 3h). This set of cycles leads to the identification of nucleotides 21 through
29.

This process continues until the nucleotide sequence of the target polynucleotide
is determined or until the remaining population of target polynucleotides is too small to
generate a detectable signal.
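
The order in which nucleotide positions are read in the two-nuclease scheme of Figures 3a through 3h can be sketched as follows. This is only an illustrative model: the function name and its parameters are hypothetical, a reach of ten nucleotides is taken from the description of the first nuclease, and the reading of nucleotides 20 and 30 at the later first-nuclease cleavages is an assumption made by analogy with the tenth nucleotide.

```python
def double_step_schedule(rounds, long_reach=10):
    """Return (nuclease, positions) pairs in the order they are read.

    Hypothetical helper: each round consists of (long_reach - 1)
    single-nucleotide cycles with the second nuclease, followed by one
    long-reach cleavage with the first nuclease.
    """
    schedule = []
    for r in range(rounds):
        first = r * long_reach + 1
        # Nine single-nucleotide cycles on the converted sub-population.
        schedule.append(
            ("second nuclease", list(range(first, first + long_reach - 1)))
        )
        # One long-reach cleavage on the non-capped population
        # (assumed to report the last position of the block).
        schedule.append(("first nuclease", [first + long_reach - 1]))
    return schedule


for nuclease, positions in double_step_schedule(3):
    print(nuclease, positions)
```

Under these assumptions the second nuclease reports positions 1 through 9, 11 through 19 and 21 through 29, in agreement with the ranges given above.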

The invention includes systems and apparatus for carrying out sequencing
automatically. Such systems and apparatus can take a variety of forms depending on
several design constraints, including i) the nature of the solid phase support used to
anchor the target polynucleotide, ii) the degree of parallel operation desired, iii) the
detection scheme employed, iv) whether reagents are re-used or discarded, and the like.
Generally, the apparatus comprises a series of reagent reservoirs, one or more reaction
vessels containing target polynucleotide, preferably attached to a solid phase support,
e.g. magnetic beads, one or more detection stations, and a computer controlled means
for transferring in a predetermined manner reagents from the reagent reservoirs to and
from the reaction vessels and the detection stations. The computer controlled means for
transferring reagents and controlling temperature can be implemented by a variety of
general purpose laboratory robots, such as that disclosed by Harrison et al,
Biotechniques, 14: 88-97 (1993); Fujita et al, Biotechniques, 9: 584-591 (1990); Wada
et al, Rev. Sci. Instrum., 54: 1569-1572 (1983); or the like. Such laboratory robots are
also available commercially, e.g. Applied Biosystems model 800 Catalyst (Foster City,
CA).

A variety of kits are provided for carrying out different embodiments of the
invention. Generally, kits of the invention include probes tailored for the nuclease and
the detection scheme of the particular embodiment. Kits further include the nuclease
reagents, the ligation reagents, and instructions for practicing the particular embodiment
of the invention. In embodiments employing natural protein endonucleases and ligases,
ligase buffers and nuclease buffers may be included. In some cases, these buffers may
be identical. Such kits may also include a methylase and its reaction buffer and a kinase
and its reaction buffer. Preferably, kits also include a solid phase support, e.g. magnetic
beads, for anchoring target polynucleotides. In one preferred kit, fluorescently labeled
probes are provided such that probes corresponding to different terminal nucleotides of
the target polynucleotide carry distinct spectrally resolvable fluorescent dyes. As used
herein, "spectrally resolvable" means that the dyes may be distinguished on the basis of their
spectral characteristics, particularly fluorescence emission wavelength, under conditions
of operation. Thus, the identity of the one or more terminal nucleotides would be
correlated to a distinct color, or perhaps ratio of intensities at different wavelengths.
More preferably, four such probes are provided that allow a one-to-one correspondence
between each of four spectrally resolvable fluorescent dyes and the four possible
terminal nucleotides on a target polynucleotide. Sets of spectrally resolvable dyes are
disclosed in U.S. patents 4,855,225 and 5,188,934; International application
PCT/US90/05565; and Lee et al, Nucleic Acids Research, 20: 2471-2483 (1992).

Example 1

Sequencing a Target Polynucleotide
Amplified from pUC19

A 368 basepair fragment of pUC19 is amplified by PCR for use as a test target
polynucleotide. The 5' terminal nucleotide of the coding strand is at position 393 and
the 3' terminal nucleotide of the coding strand is at position 740, Yanisch-Perron et al,
Gene, 33: 103-119 (1985), so that the polylinker region is spanned. Two 18-mer
primers are employed having sequences 5'-AGTGAATTCGAGCTCGGT and
5'-xCCTTTGAGTGAGCTGATA, where "x" is an amino linking group, Aminolinker II
(Applied Biosystems, Inc., Foster City, California), to which a biotin moiety is attached
using manufacturer's protocol, 5' Biotin NIO-Label Kit (Clontech Laboratories, Palo
Alto, California). The amplified target polynucleotide is isolated and attached to
streptavidin-coated magnetic beads (Dynabeads) using manufacturer's protocol,
Dynabeads Template Preparation Kit, with M280-streptavidin (Dynal, Inc., Great Neck,
New York). A sufficient quantity of the biotinylated 393 basepair fragment is provided
to load about 300 of Dynabeads M280-Streptavidin. After loading onto the
Dynabeads, the target polynucleotides are digested with Eco RI and washed to provide
a 5'-monophosphorylated protruding strand with an overhang of four nucleotides, i.e. a
-4 target polynucleotide, shown below.


    5'-pAATTCGAGCTCGGTACCCGGGGATCCTCTA . . .
            GCTCGAGCCATGGGCCCCTAGGAGAT . . .
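
The terminus drawn above can be reproduced with a short sketch. The function name is hypothetical, and the input string is an assumption: it is simply the first primer of this example joined to the sequence shown in the diagram. The cut positions used are the standard Eco RI cleavage pattern (after the G of GAATTC on each strand).

```python
# Translation table for complementing a DNA string base by base.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def ecori_terminus(top_strand):
    """Return (top, bottom) strands of the fragment to the right of an Eco RI cut.

    The bottom strand is returned 3'->5', i.e. aligned under the top strand
    as in the diagram, not as its 5'->3' reverse complement.
    """
    site = "GAATTC"
    i = top_strand.index(site)  # raises ValueError if no site is present
    top = top_strand[i + 1:]    # top strand is cut after the G
    # The bottom strand is cut after the G on its own strand, so its
    # retained part starts opposite the C of the site, 4 nt further in.
    bottom = top_strand[i + 5:].translate(COMPLEMENT)
    return top, bottom


top, bottom = ecori_terminus("AGTGAATTCGAGCTCGGTACCCGGGGATCCTCTA")
print("5'-p" + top)
print(" " * 8 + bottom)
print("overhang length:", len(top) - len(bottom))
```

The difference in length between the two strands is the four-nucleotide protruding strand referred to as a -4 target polynucleotide.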

Reactions and washes below are generally carried out in 50 µL volumes of
manufacturer's (New England Biolabs) recommended buffers for the enzymes
employed, unless otherwise indicated. Standard buffers are also described in Sambrook
et al, Molecular Cloning, 2nd Edition (Cold Spring Harbor Laboratory Press, 1989).
Note that in this test example, methylation is not required because no Fok I recognition
sequences are present in the target polynucleotide.

The following four sets of mixed probes are provided for addition to the target
polynucleotide:



    TAMRA- ATCGGATGACATCAAC
           TAGCCTACTGTAGTTGANNN

    FAM-   ATCGGATGACATCAAC
           TAGCCTACTGTAGTTGCNNN

    ROX-   ATCGGATGACATCAAC
           TAGCCTACTGTAGTTGGNNN

    JOE-   ATCGGATGACATCAAC
           TAGCCTACTGTAGTTGTNNN



where TAMRA, FAM, ROX, and JOE are spectrally resolvable fluorescent labels
attached by way of Aminolinker II (all being available from Applied Biosystems, Inc.,
Foster City, California); the bold faced nucleotides are the recognition site for Fok I
endonuclease, and "N" represents any one of the four nucleotides, A, C, G, T. TAMRA
(tetramethylrhodamine), FAM (fluorescein), ROX (rhodamine X), and JOE
(2',7'-dimethoxy-4',5'-dichlorofluorescein) and their attachment to oligonucleotides is also
described in Fung et al, U.S. patent 4,855,225.

Each of the above probes is separately incubated in sequence in approximately 5
molar excess of the target polynucleotide ends as follows: the probe is incubated for 60
minutes at 16°C with 200 units of T4 DNA ligase and the anchored target
polynucleotide in 50 µL of T4 DNA ligase buffer; after washing, the target
polynucleotide is then incubated with 100 units T4 polynucleotide kinase in the
manufacturer's recommended buffer for 30 minutes at 37°C, washed, and again
incubated for 30 minutes at 16°C with 200 units of T4 DNA ligase and the anchored
target polynucleotide in 50 µL of T4 DNA ligase buffer. Washing is accomplished by
immobilizing the magnetic bead support with a magnet and successively adding then
removing 50 µL volumes of wash buffer, e.g. TE, disclosed in Sambrook et al (cited
above). After the cycle of ligation-phosphorylation-ligation and a final washing, the
beads are interrogated for the presence of fluorescent label. On the fourth set of such
incubations, the characteristic fluorescence of JOE is detected indicating that the
terminal nucleotide is A. The labeled target polynucleotide, i.e. the ligated complex, is
then incubated with 10 units of Fok I in 50 µL of manufacturer's recommended buffer
for 30 minutes at 31°C, followed by washing in TE. As a result the target
polynucleotide is shortened by one nucleotide on each strand and is ready for the next
cycle of ligation and cleavage. The process is continued until the desired number of
nucleotides are identified.
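
The cycle of this example can be summarized in a small idealized simulation. It is a sketch only: the names are hypothetical, ligation is assumed to be perfectly specific, the degenerate "N" positions of the probes are ignored, and each Fok I cleavage is assumed to remove exactly one nucleotide. The dye assignments follow the four probes listed above, whose discriminating bases are A (TAMRA), C (FAM), G (ROX) and T (JOE).

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

# Discriminating base carried by each probe of this example.
PROBE_BASE = {"TAMRA": "A", "FAM": "C", "ROX": "G", "JOE": "T"}


def run_cycles(protruding_top, n_cycles):
    """Simulate n_cycles of ligation, detection and one-nucleotide cleavage.

    protruding_top is the target strand read 5'->3' from its terminal
    nucleotide. Returns a list of (dye detected, nucleotide called).
    """
    calls = []
    for _ in range(n_cycles):
        terminal = protruding_top[0]
        # Probes are offered one at a time; only the probe whose
        # discriminating base pairs with the terminal nucleotide ligates.
        for dye, base in PROBE_BASE.items():
            if COMPLEMENT[base] == terminal:
                # The call is made from the dye alone.
                calls.append((dye, COMPLEMENT[PROBE_BASE[dye]]))
                break
        # Idealized Fok I cleavage: the target is one nucleotide shorter.
        protruding_top = protruding_top[1:]
    return calls


for dye, base in run_cycles("AATTCGAG", 6):
    print(dye, base)
```

In the first cycle the sketch reports JOE and A, as described in the text for the fourth set of incubations.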

Example 2

Converting a -4 Protruding Strand to a
-3 Protruding Strand

A -4 protruding strand is converted into a -3 protruding strand using the
conversion probe shown below having an Ear I recognition site (indicated in bold) and a
protruding strand whose terminal nucleotide is non-phosphorylated (indicated in lower
case). The conversion probe is ligated to the terminus of the target polynucleotide using
conditions as described in Example 1:

    ACTCTTC      +  pNNNNTACCGG . . .
    TGAGAAGNNNn          ATGGCC . . .

                 ↓

    ACTCTTCNNNNTACCGG . . .
    TGAGAAGNNNnATGGCC . . .

After ligation, the complex is digested with Ear I using manufacturer's recommended
protocol to give a target polynucleotide with a -3 protruding strand:

    ACTCTTCNNNNTACCGG . . .
    TGAGAAGNNNnATGGCC . . .

                 ↓

    pNNNTACCGG . . .
        ATGGCC . . .



Example 3

Converting a -4 Protruding Strand to a
-5 Protruding Strand

A -4 protruding strand is converted into a -5 protruding strand using the
conversion probe shown below having an Hga I recognition site (indicated in bold) and
a protruding strand whose terminal nucleotide is non-phosphorylated (indicated in lower
case). The conversion probe is ligated to the terminus of the target polynucleotide using
conditions as described in Example 1:

    AGACGCCATCAT      +  pNNNNTACCGG . . .
    TCTGCGGTAGTANNNn          ATGGCC . . .

                      ↓

    AGACGCCATCATNNNNTACCGG . . .
    TCTGCGGTAGTANNNnATGGCC . . .

After ligation, the complex is digested with Hga I using manufacturer's recommended
protocol to give a target polynucleotide with a -5 protruding strand:

    AGACGCCATCATNNNNTACCGG . . .
    TCTGCGGTAGTANNNnATGGCC . . .

                      ↓

    AGACGCCATCA       +  pTNNNNTACCGG . . .
    TCTGCGGTAGTANNNn           ATGGCC . . .
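
The cleavage products drawn in Examples 2 and 3 follow from the offset between each recognition site and its two cut positions. The sketch below is illustrative only: the function name is hypothetical, and the offsets are assumptions based on the commonly cited cleavage patterns CTCTTC(1/4) for Ear I and GACGC(5/10) for Hga I, which are not stated in this text.

```python
def type_iis_cut(top, site, top_offset, bottom_offset):
    """Cut a duplex, given by its top strand, downstream of a recognition site.

    top_offset and bottom_offset are counted in nucleotides from the end of
    the site on the top strand. Returns (left top strand, right top strand,
    5' protruding strand left on the right-hand fragment).
    """
    end = top.index(site) + len(site)
    top_cut = end + top_offset
    bottom_cut = end + bottom_offset
    # Nucleotides between the two cuts stay single stranded on the target.
    overhang = top[top_cut:bottom_cut]
    return top[:top_cut], top[top_cut:], overhang


# Example 2: Ear I leaves a three-nucleotide protruding strand.
print(type_iis_cut("ACTCTTCNNNNTACCGG", "CTCTTC", 1, 4))

# Example 3: Hga I leaves a five-nucleotide protruding strand.
print(type_iis_cut("AGACGCCATCATNNNNTACCGG", "GACGC", 5, 10))
```

With these offsets the right-hand fragments begin NNNTACCGG and TNNNNTACCGG, matching the -3 and -5 protruding strands of the two examples.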


Example 4

Converting a +2 Protruding Strand to a
-3 Protruding Strand

A +2 protruding strand is converted into a -3 protruding strand using the
conversion probe shown below having an Ear I recognition site (indicated in bold) and a
protruding strand whose terminal nucleotide is non-phosphorylated (indicated in lower
case). The conversion probe is ligated to the terminus of the target polynucleotide using
conditions as described in Example 1:

    ACTCTTCGNN   +   pTACCGG . . .
    TGAGAAGc        NNATGGCC . . .

                 ↓

    ACTCTTCGNNTACCGG . . .
    TGAGAAGcNNATGGCC . . .

After ligation, the complex is digested with Ear I using manufacturer's recommended
protocol to give a target polynucleotide with a -3 protruding strand:

    ACTCTTCGNNTACCGG . . .
    TGAGAAGcNNATGGCC . . .

                 ↓

    ACTCTTCG  +  NNA  +  pNNTACCGG . . .
    TGAGAAGc                ATGGCC . . .



Example 5

Double Stepping: Sequencing by Ligation Employing
Two Restriction Endonucleases

Two nucleases, Ear I and Fok I, with different reaches are used in the same
sequencing operation. The procedure is illustrated in Fig. 3. A 368 basepair fragment
of pUC19 with a -4 protruding strand is prepared as described in Example 1. Because
the fragment contains an Ear I site (but no Fok I site), the target polynucleotide is initially
treated with an Ear I methylase, e.g. as described in Nelson et al, Nucleic Acids
Research, 17: r398-r415 (1989). Afterwards, a 9:1 mixture of the following two
probes, Probe A:Probe B, is combined in about 5 molar excess with the target
polynucleotide, ligated, kinased, and ligated, as described in Example 1 to form two
populations of ligated complexes: about 10% terminating with Probe B and about 90%
terminating with Probe A.

    Probe (A):    ATCGGATG              (Fok I recognition site)
                  TAGCCTACNNNN

    Probe (B):    CAGATCCTCTTCa         (Ear I recognition site)
                  GTCTAGGAGAAGTNNNN



The target polynucleotides are then digested with Ear I to convert about 10% of the
ligated complexes to a target polynucleotide having a -3 protruding strand. The
following probes are then used in nine cycles of
ligation-phosphorylation-ligation/identification/cleavage as described in Example 1 to
give the identity of the first nine nucleotides.



Ear I Probes

    TAMRA- CAGATCCTCTTC
           GTCTAGGAGAAGGNN

    FAM-   CAGATCCTCTTC
           GTCTAGGAGAAGGNN

    ROX-   CAGATCCTCTTC
           GTCTAGGAGAAGANN

    JOE-   CAGATCCTCTTC
           GTCTAGGAGAAGTNN



After the ninth cleavage and washing, the subpopulation of target polynucleotides that
underwent the nine cycles of cleavage are capped by treating with a DNA polymerase in
the presence of the four dideoxynucleoside triphosphates. After washing again, the
target polynucleotides are digested with Fok I to give target polynucleotides with a -4
protruding strand. Thus, at this point 10% of the original population of target
polynucleotides is 9 nucleotides shorter (on average) and capped and 90% are precisely
9 nucleotides shorter and ready for successive cycles of cleavage and ligation.

To the Fok I digested target polynucleotides is added an 8:1 mixture of Probe
A:Probe B in a ligase buffer as described above. This results in approximately the same
quantity of target polynucleotide being prepared for Ear I digestion as above.
Alternatively, a constant ratio of Probe A:Probe B could be employed throughout the
sequencing operation, which would lead to a less intense signal at each successive Fok I
digestion step, but may also permit a longer sequence to be determined. Ear I is added
to the resulting mixed population of ligated complexes under the manufacturer's
recommended protocol to convert a subpopulation to target polynucleotides with -3
protruding strands. The Ear I probes are again applied nine times as described above to
provide the identity of nucleotides 10 through 18. The process is continued as
described above until the identities of the 90 terminal nucleotides of the target
polynucleotide are obtained.
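
The probe ratios of this example can be checked with a short calculation. The sketch assumes, for illustration only, that ligation partitions the target population exactly in proportion to the Probe A:Probe B ratio and that no material is lost to incomplete cleavage or washing; both function names are hypothetical.

```python
from fractions import Fraction


def probe_ratios(converted_per_round=Fraction(1, 10)):
    """Probe A per one part Probe B needed in each round so that the same
    fraction of the ORIGINAL population is converted for Ear I every time."""
    remaining = Fraction(1)
    ratios = []
    while remaining > 0:
        ratios.append((remaining - converted_per_round) / converted_per_round)
        remaining -= converted_per_round
    return ratios


def constant_ratio_signal(round_number, fraction_b=Fraction(1, 10)):
    """Fraction of the original population converted in a given round when
    the same Probe A:Probe B mixture is used throughout."""
    return fraction_b * (1 - fraction_b) ** (round_number - 1)


ratios = probe_ratios()
print([int(r) for r in ratios])        # A:B ratio per round: 9, 8, 7, ...
print(len(ratios) * 9)                 # nucleotides read at nine per round
print(constant_ratio_signal(2))        # second round at a constant 9:1 mixture
```

Under these assumptions the second round calls for an 8:1 mixture, and ten rounds of nine nucleotides exhaust the population after 90 nucleotides, consistent with the figures given above. With a constant 9:1 mixture the converted fraction instead falls from 1/10 to 9/100 in the second round, which corresponds to the weaker signal mentioned for that alternative.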

Example 6

Sequencing a Target Polynucleotide
Amplified from pGEM7Z: Identification of
Nucleotides by the Ligation Reaction

In this example, a segment of plasmid pGEM7Z (Promega, Madison, WI) was
amplified and attached to glass beads via a double stranded DNA linker, one strand of
which was synthesized directly onto (and therefore covalently linked to) the beads. In
each sequencing cycle after ligation, an aliquot of beads was removed from the reaction
mixture and loaded onto a gel electrophoresis column for analyzing the non-covalently
bound strand of the ligated complex. The probes were designed so that the
non-covalently bound strand would always carry a fluorescent label for analysis.

A 47-mer oligonucleotide was synthesized directly on KF169 Ballotini beads
using a standard automated DNA synthesizer protocol. The complementary strand to
the 47-mer was synthesized separately and purified by HPLC. When hybridized the
resulting duplex has a Bst XI restriction site at the end distal from the bead. The
complementary strand was hybridized to the attached 47-mer in the following mixture:
25 µl complementary strand at 200 pmol/µl; 20 mg KF169 Ballotini beads with the
47-mer; 6 µl New England Biolabs #3 restriction buffer; and 25 µl distilled water. The
mixture was heated to 93°C and then slowly cooled to 55°C, after which 40 units of
Bst XI (at 10 units/µl) was added to bring the reaction volume to 60 µl. The mixture
was incubated at 55°C for 2 hours after which the beads were washed three times in TE
(pH 8.0).
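
The stated reaction volume can be verified by simple arithmetic. The sketch below assumes that all liquid volumes in the hybridization mixture are in microliters and that the beads, given by mass, contribute no volume; the variable and function names are hypothetical.

```python
def enzyme_volume_ul(units, units_per_ul):
    """Volume of enzyme stock, in microliters, that supplies the given units."""
    return units / units_per_ul


# Liquid components of the hybridization mixture, in microliters.
hybridization_ul = {
    "complementary strand": 25.0,
    "NEB #3 restriction buffer": 6.0,
    "distilled water": 25.0,
}

bst_xi_ul = enzyme_volume_ul(units=40, units_per_ul=10)
total_ul = sum(hybridization_ul.values()) + bst_xi_ul

# Amount of complementary strand: 25 ul at 200 pmol/ul.
strand_pmol = hybridization_ul["complementary strand"] * 200

print(bst_xi_ul, total_ul, strand_pmol)
```

The 4 µl of Bst XI brings the 56 µl mixture to the 60 µl reaction volume given in the text.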

The segment of pGEM7Z to be attached to the beads was prepared as follows:
Two PCR primers were prepared using standard protocols:

    Primer 1: 5'-CTAAACCATTGGTATGGGCCAGTGAATTGTAATA

    Primer 2: 5'-CGCGCAGCCCGCATCGTTTATGCTACAGACTGTC-
                 AGTGCAGCTCTCCGATCCAAA

The PCR reaction mixture consisted of the following: 1 µl pGEM7Z at 1 ng/µl; 10 µl
primer 1 at 10 pmol/µl; 10 µl primer 2 at 10 pmol/µl; 10 µl deoxyribonucleotide
triphosphates at 2.5 mM; 10 µl 10x PCR buffer (Perkin-Elmer); 0.5 µl Taq DNA
polymerase at 5 units/µl; and 58 µl distilled water to give a final volume of 100 µl. The
reaction mixture was subjected to 25 cycles of 93°C for 30 sec; 60°C for 15 sec; and
72°C for 60 sec, to give a 172 basepair product, which was successively digested with
Bbv I (100 µl PCR reaction mixture, 12 µl 10x #1 New England Biolabs buffer, 8 µl
Bbv I at 1 unit/µl, incubated at 37°C for 6 hours) and with Bst XI (to the Bbv I reaction
mixture was added: 5 µl 1 M NaCl, 67 µl distilled water, and 8 µl Bst XI at 10 units/µl,
and the resulting mixture was incubated at 55°C for 2 hours).

After passing the above reaction mixture through a Centricon 30 (Amicon, Inc.)
spin column following manufacturer's protocol, the Bbv I/Bst XI-restricted fragment
was ligated to the double stranded linker attached to the Ballotini beads in the following
mixture: 17 µl Bbv I/Bst XI-restricted fragment (10 µg), 10 µl beads (20 mg), 6 µl 10x
ligation buffer (New England Biolabs, referred to below as NEB), 5 µl T4 DNA ligase
at 2000 units/µl, and 22 µl distilled water, which mixture was incubated at 25°C for 4
hours, after which the beads were washed 3 times with TE (pH 8.0), leaving the
following target polynucleotide for sequencing:

              . . . TCTGTAGCT
    [BEAD]--
              . . . AGACATCGAATTT-5'

The strands of the following probes (24 nucleotides in labeled strand and 18
nucleotides in non-labeled strand) were separately synthesized on an automated DNA
synthesizer (model 392 Applied Biosystems, Foster City) using standard methods:

                             FAM
                              |
    5'-pGNNNTACGTGCGCATCCCGAGCQA
            ATGCACGCGTAGGGCTCG-5'

                            TAMRA
                              |
    5'-pTNNNTACGTGCGCATCCCGAGCQA
            ATGCACGCGTAGGGCTCG-5'

                             ROX
                              |
    5'-pCNNNTACGTGCGCATCCCGAGCQA
            ATGCACGCGTAGGGCTCG-5'

                             JOE
                              |
    5'-pANNNTACGTGCGCATCCCGAGCQA
            ATGCACGCGTAGGGCTCG-5'
where p is a monophosphate, N indicates A, C, G, or T, Q is a branched linker carrying
a protected amino group for attachment of a label (e.g. Uni-Link AminoModifier,
available from Clontech Laboratories, Palo Alto, CA), and FAM, TAMRA, ROX, and
JOE are as defined above. 5.0 x 10^4 pmol of each probe was combined in TE to form
a mixture at a concentration of 1000 pmol/µl.

Ligations were carried out in a mixture consisting of 5 µl beads (20 mg), 3 µl
NEB 10x ligase buffer, 5 µl probe mix, 2.5 µl NEB T4 DNA ligase (2000 units/µl), and
14.5 µl distilled water. The mixture was incubated at 16°C for 30 minutes, after which
the beads were washed 3 times in TE (pH 8.0). Cleavages were carried out in a mixture
consisting of 5 µl beads (20 mg), 3 µl 10x NEB buffer #3, 3 µl NEB Fok I (4 units/µl),
and 19 µl distilled water. The mixture was incubated at 31°C for 30 minutes, after
which the beads were washed 3 times in TE (pH 8.0).

After each ligation, a sample of the beads with the ligated complex was removed
for size analysis on a model 373 DNA sequencer using 672 GeneScan software (Applied
Biosystems). The readout of the system provides a different colored curve for
fragments labeled with the four different dyes (black for TAMRA, blue for FAM, green
for JOE, and red for ROX). A 6% denaturing (8 M urea) polyacrylamide gel was
employed in accordance with manufacturer's protocols. About 0.5 mg of beads were
placed in 4 µl of formamide loading buffer in accordance with the manufacturer's
protocol for analyzing sequencing fragments. Samples were heated to 95°C for 2 min
then cooled by placing on ice, after which the entire sample was loaded into one lane.

Results of four cycles of ligation are shown in Figures 4a through 4d. Curve a
of figure 4a demonstrates that the first nucleotide in the target sequence is correctly
identified as A. The first nucleotide is the one in the protruding strand closest to the
double stranded portion of the target polynucleotide. Curves s1 and s2 are 172 and 186
nucleotide size standards. The very low curves indicated by "b" in the figure show that
the fidelity of the ligase was very high, in that little or no other probes besides the
correct one were ligated. Curve c in figure 4b demonstrates that the second nucleotide
of the target polynucleotide is correctly identified as A. Note that as in figure 4a, only
an insignificant number of probes were incorrectly ligated, as indicated by "d". Figure 4c is
a superposition of curve c of figure 4b onto the curve of figure 4a. This shows that
curve c corresponds to a fragment one nucleotide shorter than that of curve a, as
expected after the Fok I digestion. Figure 4d is a superposition of the data on the
fragments generated in cycles 2, 3, and 4, indicated by curves e, f, and g, respectively.
Again, the fidelity of ligation is very high and the peaks of the curves are in the correct
order, as expected from the one nucleotide size reduction that takes place after each
Fok I digestion.

Example 7

Sequencing a Target Polynucleotide
Amplified from pGEM7Z: Identification of
Nucleotides by a Polymerase Extension Reaction

In this example, a segment of plasmid pGEM7Z was amplified by PCR using a
biotinylated primer and attached by the biotin to streptavidinated magnetic beads. After
each cleavage step, the resulting protruding strand of the target polynucleotide was used
as a template to extend the recessed strand by one nucleotide using a DNA polymerase
in the presence of a mixture of labeled dideoxynucleoside triphosphates. The extended
strand was then analyzed by gel electrophoresis as described above.

The PCR reaction was prepared by combining the following: 1 µl pGEM7Z
plasmid (1 pg/µl), 1 µl B002 biotinylated primer (100 pmoles/µl), 1 µl -337 primer (100
pmoles/µl), 20 µl 10x nucleoside triphosphates (2.3 mM stock of each triphosphate), 20
µl 10x Taq buffer (Perkin-Elmer), 156 µl distilled water, and 1 µl Taq (2 units/µl). The
primers had the following sequences:

    B002: 5'-biotin-CCCGACGTCGCATGCTCCTCTA

    -337: 5'-GCGCGTTGGCCGATTCATTA

The above PCR mixture was cycled 25 times through the following temperatures in a
Perkin-Elmer 9600 thermal cycler: 94°C 1 min, 52°C 1 min, and 72°C 2 min. After
cycling, to the reaction mixture was added 10 µg glycogen and 100 µl chloroform, after
which the aqueous phase was removed and combined with 20 µl 3 M NaOAc and 500
µl ethanol. After the resulting mixture was spun in a microfuge for 30 min, the
precipitate was collected, dried, and resuspended in 50 µl H2O. Prior to combining
with the biotinylated DNA, the streptavidinated magnetic beads (20 µl) were washed 3
times with 100 µl of 2x bead wash (1.0 M NaCl, Tris, Triton X-100) and then
resuspended in 10 µl of 2x bead wash. 10 µl of the biotinylated DNA solution was
added to the beads and allowed to sit for 5 min with agitation, after which the beads
were magnetically pulled to the side of the tube, the supernatant removed, and the beads
washed twice with 2x bead wash and 3 times with water.

An initial protruding strand was produced at the end of the attached target
polynucleotide by cleaving with Fok I as follows: To the beads were added: 44 µl
H2O, 5 µl 10x Fok I buffer (New England Biolabs), and 1 µl Fok I (New England
Biolabs, 4 units/µl). The mixture was incubated for 30 min, after which the supernatant
was removed from the magnetic beads. After this initial cleavage, three cycles of
extension, ligation, excision, and cleavage were carried out with the following
protocols. After each extension a sample of beads was removed from the reaction
mixture and the labeled strand of the target polynucleotide was analyzed as described in
Example 6.

Extension reactions were carried out with Sequenase DNA polymerase in the
presence of labeled dideoxynucleosides by adding to the beads the following mixture:
17.0 µl H2O, 5.0 µl 5x Sequenase buffer, 2.5 µl 10x Taq fluorescent dye-labeled
terminators (Perkin-Elmer), and 1.0 µl Sequenase 2.0 (13 units/µl). After incubation at
37°C for 15 min, proteinaceous material was extracted with 50 µl phenol/chloroform,
which was then back extracted with 25 µl H2O. The combined aqueous phases were
again extracted with 50 µl chloroform, after which the aqueous phase was removed and
mixed with 5 µl 3 M NaOAc and 125 µl ethanol. The precipitate was collected,
microfuged for 15 min, washed with 70% ethanol, and dried.

A mixed probe was prepared as described in Example 6 with the following
differences: (i) the probe is unlabeled, thus, only a single mixture need be prepared; and
(ii) the protruding strand consisted of three nucleotides such that each of the three
positions in the protruding strand could be A, C, G, or T, i.e. each was "N" as described
above. Ligation was carried out as follows: To a 0.5 ml tube containing the dried DNA
was added 20.5 µl probe (100 pmoles/µl), 2.5 µl 10x ligase buffer (New England
Biolabs), 2.0 µl ligase (New England Biolabs, 0.4 units/µl). The mixture was incubated
for 1 hour at 16°C, after which the DNA was purified on a spin column prepared as
follows: resin was swelled with 800 µl H2O for 45 min, drained, and spun at 800 rpm
for 2 min.

The labeled terminator was excised from the ligated complex with the 3'→5'
exonuclease activity of Deep Vent DNA polymerase. At the same time, the polymerase
extends the strand the length of the probe, thereby repairing the nick caused by the
presence of the dideoxy terminator. The reaction was carried out in a MicroAmp tube
(Perkin-Elmer) containing the following: 25.0 µl DNA, 3.5 µl 10x nucleoside
triphosphates (1.25 mM each), 3.5 µl 10x Vent buffer (New England Biolabs), and 2.0
µl Deep Vent DNA polymerase (2 units/µl). The mixture was incubated for 60 min at
80°C under oil, after which 15 µl H2O was added and the combined mixture was
extracted with 100 µl chloroform. The aqueous phase was removed and mixed with 5
µl 3 M NaOAc and 125 µl ethanol, after which the precipitate was collected, microfuged
for 15 min, washed with 70% ethanol, and dried.

Fok I cleavage was carried out by resuspending the DNA in 21.5 µl H2O and
adding 2.5 µl 10x Fok I buffer (New England Biolabs) and 1.0 µl Fok I (4 units/µl).
The mixture was incubated for 15 min at 37°C, after which the DNA was purified on a
spin column prepared as described above.

Results are shown in Figures 5a through 5c. The colors of the curves generated
by the GeneScan software containing the dominant peaks in the figures corresponded to
the correct nucleotide in the target polynucleotide.

SEQUENCE LISTING

(1) GENERAL INFORMATION: 

(i) APPLICANT: Sydney Brenner 

(ii) TITLE OF INVENTION: DNA Sequencing by Stepwise 
Ligation and Cleavage 

(iii) NUMBER OF SEQUENCES: 16 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Stephen C. Macevicz, Lynx
Therapeutics, Inc. 

(B) STREET: 3832 Bay Center Place 

(C) CITY: Hayward 

(D) STATE: California 

(E) COUNTRY: USA 

(F) ZIP: 94545 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: 3.5 inch diskette 

(B) COMPUTER: IBM compatible

(C) OPERATING SYSTEM: Windows 3.1/DOS 5.0

(D) SOFTWARE: Microsoft Word for Windows, vers. 2.0



(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 

(C) CLASSIFICATION: 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: 08/222,300 

(B) FILING DATE: 04-APR-94 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: 08/280,441 

(B) FILING DATE: 25-JUL-94 

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: Stephen C. Macevicz

(B) REGISTRATION NUMBER: 30,285

(C) REFERENCE/DOCKET NUMBER: slc3

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: (415) 638-5552 

(B) TELEFAX: (510)670-9302 

(2) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1



AGTGAATTCG AGCTCGGT 



(2) INFORMATION FOR SEQ ID NO: 2: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2 



CCTTTGAGTG AGCTGATA 



(2) INFORMATION FOR SEQ ID NO: 3: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3 



AATTCGAGCT CGGTACCCGG GGATCCTCTA 



(2) INFORMATION FOR SEQ ID NO: 4: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4 



ATCGGATGAC ATCAAC 



(2) INFORMATION FOR SEQ ID NO: 5: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 nucleotides 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5



ACTCCTTCNN NNTACCGG 



(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6



AGACGCCATC ATNNNNTACC GG 



(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7



CAGATCCTCT TCA 



(2) INFORMATION FOR SEQ ID NO: 8: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8



CAGATCCTCT TC 



(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 34 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:



CTAAACCATT GGTATGGGCC AGTGAATTGT AATA 



(2) INFORMATION FOR SEQ ID NO: 10: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 55 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 



CGCGCAGCCC GCATCGTTTA TGCTACAGAC TGTCAGTGCA 
GCTCTCCGAT CCAAA 



(2) INFORMATION FOR SEQ ID NO: 11:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 



GNNNTACGTG CGCATCCCGA GC 



(2) INFORMATION FOR SEQ ID NO: 12:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 



TNNNTACGTG CGCATCCCGA GC



(2) INFORMATION FOR SEQ ID NO: 13:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 



CNNNTACGTG CGCATCCCGA GC 



(2) INFORMATION FOR SEQ ID NO: 14: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 



ANNNTACGTG CGCATCCCGA GC 



(2) INFORMATION FOR SEQ ID NO: 15: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 



CCCGACGTCG CATGCTCCTC TA 



(2) INFORMATION FOR SEQ ID NO: 16: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 nucleotides 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 



GCGCGTTGGC CGATTCATTA 
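
Each entry of the sequence listing declares a LENGTH and then gives the residues in space-separated groups of ten, so the two can be cross-checked mechanically. The following is a minimal sketch of such a check, assuming only that whitespace is not part of the sequence; the names `sequence_length`, `entries` and `mismatches` are hypothetical and are not part of the application.

```python
def sequence_length(listing_text: str) -> int:
    """Count residues in a listing entry, ignoring the spaces between groups."""
    return sum(1 for ch in listing_text if not ch.isspace())


# (declared LENGTH, sequence text) pairs copied from three entries of the listing.
entries = {
    2: (18, "CCTTTGAGTG AGCTGATA"),
    10: (55, "CGCGCAGCCC GCATCGTTTA TGCTACAGAC TGTCAGTGCA GCTCTCCGAT CCAAA"),
    16: (20, "GCGCGTTGGC CGATTCATTA"),
}


def mismatches(listing: dict) -> dict:
    """Return {SEQ ID NO: (declared, counted)} for entries whose lengths disagree."""
    result = {}
    for seq_id, (declared, text) in listing.items():
        counted = sequence_length(text)
        if counted != declared:
            result[seq_id] = (declared, counted)
    return result


if __name__ == "__main__":
    print(mismatches(entries))
```

An empty result means every declared length agrees with the residues as printed; a non-empty result points to an entry worth re-reading against the original document.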



I claim:

1. A method for determining a nucleotide sequence of a polynucleotide, the method
comprising the steps of:

(a) ligating a probe to an end of a polynucleotide, the probe having a nuclease
recognition site;

(b) identifying one or more nucleotides at the end of the polynucleotide; and

(c) cleaving the polynucleotide with a nuclease recognizing the nuclease
recognition site of the probe such that the polynucleotide is shortened by one or more
nucleotides.

2. The method of claim 1 further including the step of repeating said steps (a)
through (c) until said nucleotide sequence of said polynucleotide is determined.

3. The method of claim 2 wherein said nuclease is a type IIs restriction
endonuclease.

4. The method of claim 3 further including a step of blocking recognition sites of
said nuclease on said polynucleotide.

5. The method of claim 4 wherein said step of ligating includes treating said
polynucleotide and said probe with a ligase.

6. The method of claim 5 wherein said polynucleotide has a protruding strand at at
least one end and wherein said probe has a protruding strand at one end, the protruding
strand of said probe being complementary to the protruding strand at one end of said
polynucleotide.

7. The method of claim 6 wherein said protruding strand of said polynucleotide has
a 5'-phosphoryl group and wherein said complementary protruding strand of said probe
lacks a 5'-phosphoryl group.

8. The method of claim 7 wherein said step of ligating includes treating said
polynucleotide and said probe in succession with (i) a ligase to ligate said protruding
strand having said 5'-phosphoryl group to said probe, (ii) a kinase to phosphorylate said
complementary protruding strand of said probe, and a ligase to ligate said
complementary protruding strand of said probe to said polynucleotide.

9. The method of claim 6 wherein said step of ligating includes providing said
probe as a mixture such that said complementary protruding strand of said probe
includes every possible sequence of nucleotides the length of said protruding strand.

10. The method of claim 6 further including the step of removing unligated probe
from said polynucleotide after said step of ligating.

11. The method of claim 6 wherein said step of identifying includes identifying a
nucleotide in said protruding strand of said polynucleotide by the identity of said probe
ligated thereto.

12. The method of claim 11 further including the step of capping said polynucleotide
which fails to ligate to said probe.

13. The method of claim 12 wherein said step of capping includes extending said
polynucleotide with a DNA polymerase in the presence of chain-terminating nucleoside
triphosphates.

14. The method of claim 13 wherein said chain-terminating nucleoside triphosphates
are dideoxynucleoside triphosphates.

15. The method of claim 6 wherein said step of identifying includes identifying a
nucleotide in said protruding strand of said polynucleotide by extending a strand of said
polynucleotide or said probe with a nucleic acid polymerase in the presence of
chain-terminating nucleoside triphosphates.

16. The method of claim 15 wherein said step of identifying further includes
extending a strand of said polynucleotide.

17. The method of claim 16 wherein said chain-terminating nucleoside triphosphates
are labeled.

18. The method of claim 15 wherein said step of identifying further includes
extending a strand of said probe and wherein said chain-terminating nucleoside
triphosphates are labeled.






19. The method of claim 1 wherein said polynucleotide has a protruding strand at at
least one end and wherein said probe has a protruding strand at one end, the protruding
strand of said probe being complementary to the protruding strand at one end of said
polynucleotide.

20. The method of claim 19 further including the step of repeating said steps (a)
through (c) until said nucleotide sequence of said polynucleotide is determined.

21. The method of claim 20 wherein said step of ligating includes treating said
polynucleotide and said probe with a ligase.

22. The method of claim 21 wherein said nuclease is a type IIs restriction
endonuclease.

23. The method of claim 22 wherein said step of ligating includes providing said
probe as a mixture such that said complementary protruding strand of said probe
includes every possible sequence of nucleotides the length of said protruding strand.

24. The method of claim 23 further including the step of removing unligated probe
from said polynucleotide after said step of ligating.

25. The method of claim 1 wherein said step of identifying includes identifying a
nucleotide in said protruding strand of said polynucleotide by extending a strand of said
polynucleotide or said probe with a nucleic acid polymerase.

26. The method of claim 25 further including the step of repeating said steps (a)
through (c) until said nucleotide sequence of said polynucleotide is determined.

27. The method of claim 26 wherein said step of identifying further includes
extending a strand of said polynucleotide in the presence of chain-terminating
nucleoside triphosphates.

28. The method of claim 27 wherein said chain-terminating nucleoside triphosphates
are labeled.

29. The method of claim 28 wherein said chain-terminating nucleoside triphosphates
are labeled with fluorescent dyes.

30. The method of claim 29 wherein said fluorescent dyes have spectrally resolvable
fluorescence emission bands.

31. The method of claim 1 wherein said polynucleotide has a protruding strand at
one end and is attached to a solid phase support by another end and wherein said probe
has a protruding strand at one end, the protruding strand of said probe being
complementary to the protruding strand at one end of said polynucleotide.

32. The method of claim 31 further including the step of repeating said steps (a)
through (c) until said nucleotide sequence of said polynucleotide is determined.

33. The method of claim 32 wherein said step of ligating includes treating said
polynucleotide and said probe with a ligase.

34. The method of claim 33 wherein said nuclease is a type IIs restriction
endonuclease.

35. The method of claim 34 wherein said step of ligating includes providing said
probe as a mixture such that said complementary protruding strand of said probe
includes every possible sequence of nucleotides the length of said protruding strand.

36. The method of claim 35 further including the step of removing unligated probe
from said polynucleotide after said step of ligating.

37. The method of claim 36 wherein said step of identifying includes identifying a
nucleotide in said protruding strand of said polynucleotide by extending a strand of said
polynucleotide with a nucleic acid polymerase in the presence of chain-terminating
nucleoside triphosphates.

38. A method for determining a nucleotide sequence of a polynucleotide, the method
comprising the steps of:

(a) ligating a probe to an end of a polynucleotide having a protruding strand to
form a ligated complex, the probe having an end with a complementary protruding
strand to that of the polynucleotide and the probe having a nuclease recognition site;

(b) cleaving the ligated complex with a nuclease, the nuclease recognizing the
recognition site and cleaving the ligated complex such that an augmented probe is
released leaving a protruding strand on the polynucleotide;

(c) identifying one or more nucleotides in the protruding strand of the
polynucleotide; and

(d) repeating steps (a) through (c) until the nucleotide sequence of the
polynucleotide is determined.

39. The method of claim 38 wherein said nuclease is a type IIs restriction
endonuclease and wherein said polynucleotide is provided with recognition sites of said
nuclease blocked.

40. The method of claim 39 wherein said recognition sites of said polynucleotide are
blocked with a methylase.

41. The method of claim 39 wherein said step of ligating includes treating said
polynucleotide with a ligase.

42. The method of claim 41 wherein said protruding strand of said polynucleotide
has a 5'-phosphoryl group and wherein said complementary protruding strand of said
probe lacks a 5'-phosphoryl group.

43. The method of claim 42 wherein said step of ligating includes treating said
polynucleotide and said probe in succession with (i) a ligase to ligate said protruding
strand having said 5'-phosphoryl group to said probe, (ii) a kinase to phosphorylate said
complementary protruding strand of said probe, and a ligase to ligate said
complementary protruding strand of said probe to said polynucleotide.

44. The method of claim 43 wherein said step of identifying includes identifying a
nucleotide in said protruding strand of said polynucleotide by the identity of said probe
ligated thereto.

45. The method of claim 41 wherein said polynucleotide is attached to a solid phase
support.

46. The method of claim 45 wherein said step of ligating includes providing said
probe as a mixture such that said complementary protruding strand of said probe
includes every possible sequence of nucleotides the length of said protruding strand.

47. The method of claim 46 further including the step of removing unligated probe
from said polynucleotide after said step of ligating.

48. The method of claim 47 wherein said step of identifying includes identifying a
nucleotide in said protruding strand of said polynucleotide by the identity of said probe
ligated thereto.

49. The method of claim 47 wherein said step of identifying includes identifying a
nucleotide in said protruding strand of said polynucleotide by extending a strand of said
polynucleotide or said probe with a nucleic acid polymerase in the presence of
chain-terminating nucleoside triphosphates.

50. The method of claim 47 wherein said solid phase support is a microparticle.

51. The method of claim 50 wherein said type IIs restriction endonuclease is Fok I.

52. A method for determining a nucleotide sequence of a polynucleotide, the method
comprising the steps of:

(a) providing a polynucleotide in double stranded form such that the
polynucleotide is attached to a solid phase support and has a protruding strand at one
end;

(b) ligating a probe to the protruding strand of the polynucleotide to form a
ligated complex, the probe having an end with a complementary protruding strand to
that of the polynucleotide and the probe having a type IIs endonuclease recognition site;

(c) identifying a nucleotide in the protruding strand of the polynucleotide by the
identity of the ligated probe;

(d) cleaving the ligated complex with a type IIs endonuclease that recognizes the
type IIs endonuclease recognition site and cleaves the ligated complex such that an
augmented probe is released leaving a new protruding strand on the polynucleotide; and

(e) repeating steps (a) through (d) until the nucleotide sequence of the
polynucleotide is determined.

53. The method of claim 52 wherein said probe comprises a first single stranded
oligonucleotide and a second single stranded oligonucleotide, the first single stranded
oligonucleotide having an end with complementary nucleotides to those in said
protruding strand of said polynucleotide and the second single stranded oligonucleotide
being complementary to a portion of the first single stranded oligonucleotide such that
the first and second single stranded oligonucleotides are capable of forming a duplex
containing a type IIs endonuclease recognition site, and wherein said step of ligating
includes (i) annealing the first single stranded oligonucleotide to said protruding strand
of said polynucleotide under conditions that promote the formation of a perfectly
matched duplex therebetween, (ii) ligating the first single stranded oligonucleotide to
said polynucleotide, (iii) annealing the second single stranded oligonucleotide to the first
single stranded oligonucleotide, and (iv) ligating the second single stranded
oligonucleotide to said polynucleotide.

54. The method of claim 53 wherein said step of ligating includes providing said first
single stranded oligonucleotide as a mixture such that said complementary nucleotides in
said end of said first single stranded oligonucleotide includes every possible sequence of
nucleotides the length of said end.

55. The method of claim 54 further including the step of removing unligated said
first and second single stranded oligonucleotides from said polynucleotide after said step
of ligating.

56. The method of claim 55 wherein said probe comprises four components, each
component being capable of indicating the presence of a different nucleotide in said
protruding strand of said polynucleotide upon ligation.

57. The method of claim 56 wherein each of said components of said probe is
labeled with a different fluorescent dye and the different fluorescent dyes are spectrally
resolvable.

58. A method for determining a nucleotide sequence of a polynucleotide, the method
comprising the steps of:

(a) providing a polynucleotide in double stranded form such that the
polynucleotide is attached to a solid phase support and has a protruding strand and a
recessed strand at one end;

(b) identifying a nucleotide in the protruding strand of the polynucleotide by
extending the recessed strand with a nucleic acid polymerase;

(c) ligating a probe to the one end of the polynucleotide, the probe having a type
IIs restriction endonuclease recognition site;

(d) cleaving the polynucleotide with a type IIs restriction endonuclease that
recognizes the type IIs endonuclease recognition site leaving a new protruding strand on
the polynucleotide; and

(e) repeating steps (a) through (d) until the nucleotide sequence of the
polynucleotide is determined.

59. The method of claim 58 wherein said nucleic acid polymerase extends said
recessed strand in the presence of chain-terminating nucleoside triphosphates.

60. The method of claim 59 further including the step of removing unligated probe
from said polynucleotide after said step of ligating.

61. The method of claim 60 further including a step of blocking recognition sites of
said type IIs restriction endonuclease on said polynucleotide.

62. The method of claim 61 wherein said recognition sites of said polynucleotide are
blocked with a methylase.

63. The method of claim 61 wherein said probe has a 5' protruding strand one
nucleotide less in length than said protruding strand of said polynucleotide and wherein
said step of ligating includes providing said probe as a mixture such that the protruding
strand of the probe includes every possible sequence of nucleotides the length of the
protruding strand.

64. The method of claim 63 wherein said chain-terminating nucleoside triphosphate
is a labeled dideoxynucleoside triphosphate and wherein said step of identifying includes
identifying said one or more nucleotides by the label on the labeled dideoxynucleoside
triphosphate incorporated into said recessed strand of said polynucleotide.

65. The method of claim 64 further including the steps of excising said labeled
dideoxynucleotide and extending said recessed strand with a nucleic acid polymerase.

66. The method of claim 65 wherein said step of excising is carried out with T4
DNA polymerase in the presence of deoxyribonucleoside triphosphates.

67. A double stranded nucleic acid probe comprising:

a first oligonucleotide strand;

a second oligonucleotide strand such that the first and second oligonucleotide
strands form a perfectly matched duplex in a duplex forming region and such that the
second oligonucleotide strand forms a protruding strand with respect to the duplex
forming region, the protruding strand including every possible sequence of nucleotides
the length of the protruding strand; and

a nuclease recognition site within the duplex forming region.

68. The double stranded nucleic acid probe of claim 67 wherein said nuclease
recognition site is the recognition site of a type IIs restriction endonuclease having a
recognition site and a cleavage site and wherein the cleavage site is located outside of
said duplex forming region.

69. A method of determining the zygosity of an individual at a predetermined
genetic locus having a plurality of allelic forms of DNA, the method comprising the
steps of:

(a) providing a sample of the DNA from the predetermined genetic locus such
that the sample of DNA comprises polynucleotides, each polynucleotide of the sample
having a protruding strand and a recessed strand;

(b) ligating a probe to an end of each polynucleotide to form one or more ligated
complexes, the probe having a nuclease recognition site;

(c) identifying the kind and relative abundance of nucleotides in the protruding
strand of the polynucleotide;

(d) cleaving the ligated complexes with a nuclease; and

(e) repeating steps (b) through (d) until the nucleotide sequences of the
polynucleotides of the genetic locus are determined.

70. The method of claim 69 wherein each of said polynucleotides is attached to a
separate solid phase support or a separate region of the same solid phase support.

71. The method of claim 70 wherein said nuclease is a type IIs restriction
endonuclease and wherein said step of identifying includes identifying a nucleotide in
each of said protruding strands of said polynucleotides by extending a strand of each of
said polynucleotides with a nucleic acid polymerase in the presence of chain-terminating
nucleoside triphosphates.

72. The method of claim 71 further including the step of removing unligated probe
from said polynucleotide after said step of ligating.

73. The method of claim 72 wherein said chain-terminating nucleoside triphosphates
are labeled dideoxynucleoside triphosphates and wherein said step of identifying
includes identifying said nucleotide by the label on the labeled dideoxynucleoside
triphosphates incorporated into said recessed strand of said polynucleotide.

74. The method of claim 73 further including the steps of excising said labeled
dideoxynucleotides and extending said recessed strands with a nucleic acid polymerase.

75. The method of claim 74 wherein said step of excising is carried out with T4
DNA polymerase in the presence of deoxyribonucleoside triphosphates.

76. A kit for determining the nucleotide sequence of a polynucleotide, the kit
comprising:

a probe capable of being ligated to an end of a polynucleotide to form a ligated
complex, the probe having a type IIs restriction endonuclease recognition site;

a type IIs restriction endonuclease capable of recognizing the type IIs restriction
endonuclease recognition site of the probe.

77. The kit of claim 76 further comprising a ligase for ligating said probe to said end
of said polynucleotide.

78. The kit of claim 77 further comprising a first reaction buffer for said ligase and a
second reaction buffer for said type IIs restriction endonuclease.

79. The kit of claim 78 further comprising:

a nucleic acid polymerase;

labeled chain-terminating nucleoside triphosphates; and

a third reaction buffer for the nucleic acid polymerase.
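
The cycle recited in the method claims (ligate a probe from a mixture, identify nucleotides from the ligated probe, cleave to expose a new protruding strand, repeat) can be illustrated with a toy simulation. The following is only a sketch under stated assumptions and is not the claimed method itself: the polynucleotide is modeled as a single string read from its protruding end, the protruding strand is assumed to be four nucleotides long, only a perfectly complementary probe is assumed to ligate, and each cleavage is assumed to shorten the polynucleotide by `step` nucleotides. All function and parameter names are hypothetical.

```python
from itertools import product

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def probe_mixture(overhang_len: int) -> list:
    """Every possible protruding-strand sequence of the given length (4**n probes)."""
    return ["".join(p) for p in product("ACGT", repeat=overhang_len)]


def sequence_by_ligation_and_cleavage(target: str, overhang_len: int = 4, step: int = 1) -> str:
    """Toy model: call bases of `target` by repeated ligate/identify/cleave cycles."""
    probes = probe_mixture(overhang_len)
    called = []
    remaining = target
    while len(remaining) >= overhang_len:
        overhang = remaining[:overhang_len]
        # Ligate: assume only the perfectly complementary probe ligates.
        ligated = next(p for p in probes if p == reverse_complement(overhang))
        # Identify: infer the first `step` target bases from the ligated probe's identity.
        called.append(reverse_complement(ligated)[:step])
        # Cleave: shorten the polynucleotide, exposing a new protruding strand.
        remaining = remaining[step:]
    return "".join(called)


if __name__ == "__main__":
    print(len(probe_mixture(4)))
    print(sequence_by_ligation_and_cleavage("CCTTTGAGTG"))
```

In this model the loop stops once fewer than `overhang_len` nucleotides remain, so the last few bases of the input are never called; a real protocol would depend on the enzyme's cleavage geometry and on the chemistry of the support attachment, neither of which is modeled here.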



[Drawing sheets 1/17 to 17/17 (substitute sheets, Rule 26). Only the following labels are recoverable from the scanned drawings:]

Fig. 1a: probe with LABEL and RECOGNITION SITE (reference numerals 12, 14, 20, 22).

Fig. 2: PROBE and AUGMENTED PROBE (reference numerals 24, 26).

Fig. 1b: Ligate; Polymerase and labeled ddNTPs; Cleave.

Fig. 1c: Polymerase and labeled ddNTPs; Ligate; Polymerase and dNTPs (excise, extend and displace); Cleave.

Fig. 1d: Ligate; Extend/Identify; Cleave.

Fig. 1e: Extend with dATP(3'-NH2) (150); Ligate Probe (156); Cleave.

Fig. 1f: Anneal/Ligate (two steps).

Fig. 1g: Anneal/Ligate (two steps).

Figs. 3a, 3b, 3e, 3f and 3h: horizontal axis NUCLEOTIDES SEQUENCED (0 to 30). Fig. 3a shows a SOLID PHASE SUPPORT (32). Two of these sheets carry the annotation "10 cycles ligation, cleavage; then cap".

Sheets 12/17 to 15/17: plots; the legible axis label is Fluorescence Intensity (figure numbers not legible).

Figs. 5a, 5b and 5c: Fluorescence Intensity versus Polynucleotide Length.




